DNA sequences from Brassicaceae encoding squalene epoxidase and process of raising squalene levels in plants therewith

ABSTRACT

The invention provides DNA isolated from a plant species of the family Brassicaceae that can be introduced into the genomes of plants to produce genetically-modified plants having higher levels of squalene than the natural plants. The DNA corresponds to a squalene epoxidase gene of the same or a related plant, and may have the sequence as shown by SEQ ID NO:1, SEQ ID NO:3, or SEQ ID NO:5, or a sequence having at least 60% identity with such a sequence. The DNA is introduced into the genome in a way that results in down-regulation of an endogenous plant squalene epoxidase gene to suppress the expression of squalene epoxidase. The invention also relates to a process of producing genetically-modified plants, plasmids and vectors used in the method, genetically-modified plants and seeds thereof and a method of producing squalene from the modified plants.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a national stage application of International Patent Application No. PCT/CA97/00175, filed Mar. 13, 1997, which claims benefit under 35 U.S.C. § 119(e) from Provisional Application No. 60/013,340, filed Mar. 13, 1996.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to the production of squalene for commercial and industrial uses. More particularly, the invention relates to a process by which natural squalene levels in plants can be increased, and to nucleotide sequences that can be introduced into plants to cause the desired increase, and plasmids, vectors, etc., useful in the process.

2. Description of the Related Art

There is a U.S. $125 million per annum market for squalene, a colourless oil used in the cosmetics and health industries (Kaiya, 1990). Squalene is currently obtained mainly from shark liver, but it also occurs in small quantities in vegetable oils. The supply of squalene extracted from shark liver is declining (Kaiya, 1990), and the harvesting of sharks for this purpose is in any case environmentally unfriendly and is becoming less acceptable as environmental concerns increase in society.

Squalene can be extracted from olive oil, although the amounts are not sufficient to supply even the cosmetics market (Bondioli et al. 1992; Bondioli et al. 1993). Squalene could be extracted from other vegetable oils, but the levels of the hydrocarbon in the oil are too low for this to be economically viable. There are at present no Canadian crops used for squalene production. It has been suggested that, if the levels of squalene occurring in oilseeds could be increased, the traditional source of squalene could be replaced by oilseed crops, to the benefit of both the environment and those countries, such as Canada, that grow crops of this kind in abundance. Many vegetable oils undergo deodorization by vacuum distillation as a routine part of refining. Most of the squalene in the oil can be recovered in the deodorizer distillate which is a by-product of this process (Bondioli et al., 1993). Typically, squalene is concentrated more than one hundred fold in the deodorizer distillate relative to the levels in unrefined vegetable oils. For commercial viability, vegetable oil deodorizer distillates should contain at least 5% (w/w) squalene. Currently, soybean and canola deodorizer distillates contain squalene in the 0.1-3% range (Ramamurthi, S., 1994). Consequently, an increase of two-fold or more in the squalene content of these oilseeds could result in commercially viable squalene production from vegetable oils.
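The arithmetic behind the two-fold target can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the sketch assumes that the squalene level in the deodorizer distillate scales linearly with the squalene content of the oilseed, which the passage above implies but does not state.

```python
# Hypothetical helpers; assumes distillate squalene scales linearly with
# the squalene content of the oilseed (an assumption, not a cited result).
VIABLE_THRESHOLD = 5.0  # % (w/w) squalene needed in the distillate


def distillate_after_increase(current_percent, fold_increase):
    """Return the projected distillate squalene level in % (w/w)."""
    return current_percent * fold_increase


def fold_needed(current_percent, threshold=VIABLE_THRESHOLD):
    """Return the fold increase required to reach the viability threshold."""
    return threshold / current_percent


if __name__ == "__main__":
    # The two ends of the 0.1-3% range reported for soybean and canola.
    for level in (0.1, 3.0):
        projected = distillate_after_increase(level, 2.0)
        print(level, projected, fold_needed(level))
```

Under this assumption, a two-fold increase lifts a distillate at the top of the reported range (3%) above the 5% threshold, whereas a distillate at the bottom of the range would need a much larger increase.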

It has been shown that in plant cell cultures, squalene accumulates in the presence of squalene epoxidase inhibitors, e.g. allylamines such as terbinafine (Yates et al. 1991). Apparently, much of the squalene produced in plants is converted to the epoxide by squalene epoxidase, and ultimately to plant sterols. In fact, all plant and higher life forms contain squalene and squalene epoxidase genes, but little squalene accumulates in the tissues of such life forms because of the effects of the expressed squalene epoxidase. Therefore, inhibition of the epoxidase gives squalene an opportunity to accumulate. However, there are as yet no commercial processes based on this concept.

A main problem addressed by the inventors of the present invention is therefore to create a plant crop, particularly an oilseed crop, which accumulates squalene in harvestable tissues, such as seeds, at sufficient levels for commercially-viable extraction.

BRIEF SUMMARY OF THE INVENTION

An object of the present invention is to provide new sources of squalene that have the potential to be exploited on a commercial basis to replace conventional commercial sources of squalene.

Another object of the present invention is to generate squalene-producing plants modified to accumulate squalene in the plant tissue (e.g. in seeds) in sufficient quantities to make the extraction of squalene commercially attractive.

Another object of the invention is to identify squalene epoxidase genes in plants, and to partially or completely neutralise the expression of such genes.

Another object of the invention is to produce DNA clones, constructs and vectors suitable for modifying the genomes of plants to reduce expression of squalene epoxidase.

Yet another object of the invention is to provide a commercial process for producing squalene from plant tissue, especially seeds.

The inventors of the present invention have discovered the DNA sequences of the genes encoding squalene epoxidase (squalene monooxygenase (2,3-epoxidizing); EC 1.14.99.7) from the plants Arabidopsis thaliana (thale cress), and Brassica napus (rapeseed, canola), as well as a second gene from Arabidopsis and one from Ricinus communis (castor), and using this knowledge have developed a process of modifying the genomes of such plants to produce genetically-modified plants which accumulate squalene at higher than natural levels. Moreover, the process may be operated to increase squalene levels in plants using DNA based on squalene epoxidase genes from different but related plants.

According to one aspect of the invention, there is provided an isolated and cloned DNA (polynucleotide) suitable for introduction into a genome of a plant to suppress expression of squalene epoxidase by said plant below natural levels, wherein the DNA has a sequence corresponding at least in part to a squalene epoxidase gene of a plant.

The DNA preferably has a sequence corresponding to all or part of a specific sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:9 and SEQ ID NO:10 (as shown in the following Sequence Listing), or has at least 60% (more preferably at least 70%) homology thereto.

The measure of homology between two DNA (polynucleotide) sequences as used in this specification is the similarity index given by application of the Wilbur-Lipman algorithm of the MEGALIGN® computer program (DNASTAR) in aligning and comparing DNA sequences corresponding to a complete polypeptide coding region using the parameters ktuple=3, gap penalty=3 and window=20.

According to another aspect of the invention, there is provided a process of producing genetically-modified plants having increased levels of squalene in tissues of the plants compared to corresponding wild-type plants, wherein the plant genome is modified to suppress expression of squalene epoxidase by said plant. The genome is modified by introducing at least one exogenous DNA sequence that corresponds, at least in part, to one or more endogenous squalene epoxidase genes of the plant.

The DNA sequence introduced into said plant genome has at least 60%, and more preferably at least 70%, homology to said one or more of the endogenous squalene epoxidase genes, and is preferably all or part of a sequence selected from SEQ ID NO:1, SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:9 and SEQ ID NO:10.

According to yet another aspect of the invention, at least in a preferred form, there is provided a process of producing genetically-modified plants having increased levels of squalene in tissues of the plants compared to corresponding wild-type plants, wherein the plant genome is modified to suppress expression of squalene epoxidase by said plant, thereby raising squalene levels of the plant, by introducing into the genome of the plant a nucleotide sequence that reduces or prevents expression of squalene epoxidase. The DNA introduced into the genome includes a transcriptional promoter and a sequence that when transcribed from the promoter is complementary or antisense to all or part of at least one squalene epoxidase messenger RNA produced by the plant.

The invention also relates to plasmids and vectors used in the processes indicated above, and as disclosed later.

The invention further relates to a genetically-modified plant capable of accumulating squalene at levels higher than the corresponding wild-type plant, produced by a process as indicated above, or a seed of such a plant.

The invention additionally relates to a process of producing squalene, which involves growing a genetically-modified plant as defined above, harvesting the plant or seeds of the plant, and extracting squalene from the harvested plant or seeds.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

FIGS. 1A-1E show the alignment of deduced amino acid sequences of the clones pDR111 (B. napus 111) [SEQ ID NO:4], pDR411 (B. napus 411) [SEQ ID NO:11] and 129F12T7 (Arabidopsis) [SEQ ID NO:2], and of the known squalene epoxidase genes of mouse (DNA Database of Japan D42048) [SEQ ID NO:6], rat (DNA Database of Japan D37920) [SEQ ID NO:7], and baker's yeast (Genbank M64994) [SEQ ID NO:8]; the alignment was done using the MEGALIGN™ program of the LASERGENE™ suite of programs (DNASTAR) using a multiple alignment gap penalty of 20; and

FIGS. 2, 3 and 4 are plasmid maps of three vectors (pSE111A, pSE411A and pSE129A, respectively) produced according to one embodiment of the present invention.

DETAILED DESCRIPTION OF THE INVENTION

General Discussion

The concept underlying the present invention is to identify squalene epoxidase genes of oilseed plants (or possibly other plants, since all plants appear to have genes for the production of squalene, and particularly those plants that are capable of accumulating squalene in their harvestable tissue) and then to use that knowledge to create genetically-modified plants in which the expression of squalene epoxidase is decreased partially or fully compared to the natural level of expression, so that squalene naturally produced by the plants can accumulate in the seeds or other tissue to levels that make extraction commercially attractive.

The approach taken by the inventors of the present invention to identify squalene epoxidase genes of plants was initially to use the DNA sequence of a known squalene epoxidase gene from yeast to identify equivalent genes in suitable plant species, e.g. by heterologous hybridization, on the assumption that all squalene epoxidase genes will have a considerable degree of similarity. Once one or several plant squalene epoxidase genes have been identified in this way, those plant genes can then be used to identify additional squalene epoxidase genes from other plants.

Heterologous Hybridization

Nucleic acid hybridization is a technique used to identify specific nucleic acids from a mixture. Southern analysis is a type of nucleic acid hybridization in which DNA is typically digested with restriction enzymes, separated by gel electrophoresis and bound to a nitrocellulose or nylon membrane. A nucleic acid probe, which is typically radio-labeled or otherwise rendered easily detectable, is hybridized to the bound DNA by exposing it to the membrane-bound DNA under specific conditions and washing any unbound or loosely bound probe away. The location of the bound probe is then detected by autoradiography or other detection method. The location of the bound probe is an indication that DNA sequences that are similar to those in the probe nucleic acid are present. Hybridization may also be done with DNA of clones of a recombinant DNA library, such as a cDNA library, when that DNA has been bound to a membrane after plating the library out (Ausubel et al., 1994). Of course, the method used by the inventors to identify the genes disclosed in the present application may be used to identify equivalent genes from other plants. As noted above, the process originally used by the inventors to identify the Arabidopsis gene was based on further analysis of a gene that was tentatively identified from a publicly available database containing partial sequences (Expressed Sequence Tags or ESTs) submitted by other workers from randomly chosen (unidentified) gene clones. ESTs from other species (such as rice, castor) can also be searched in the same way to find other possible squalene epoxidase genes present in such plants (depending on the more or less accidental sequencing of the desired genes) using the Arabidopsis and B. napus sequences disclosed herein.

The inventors have, for example, found other ESTs from plants that have tentatively been identified as squalene epoxidase genes by comparing them to the Arabidopsis and B. napus sequences discussed above. Thus, sequences corresponding to Genbank Accession Numbers T15019 (obtainable from Dr. C. R. Somerville, Carnegie Institution, 290 Panama St., Stanford, Calif. 94305, USA) and W43353 (obtainable from DNA Stock Center, Arabidopsis Biological Resource Center, Ohio State University, 1060 Carmack Road, Columbus, Ohio 43210-1002, USA) have been predicted to correspond to squalene epoxidase genes from Ricinus communis (castor) and Arabidopsis (a second Arabidopsis gene).

Perhaps more importantly, the process by which the B. napus gene was cloned can be used to clone the corresponding gene from other plant species. The heterologous hybridization methods are well known, but the process requires the knowledge and use of the novel plant squalene epoxidase sequences disclosed in this application.

If the hybridization and washing are done under conditions which are considered stringent (e.g., at relatively high temperature and/or low salt and/or high formamide concentration), then the sequences detected generally have a high degree of similarity to the probe nucleic acid. If hybridization and washing are done at lower stringency, then it is possible to detect sequences that are lower in similarity to the probe. Discussions of this detection of similar sequences by hybridization can be found in Beltz et al. (1983) and Yamamoto and Kadowaki (1995). From the point of view of gene cloning, if one obtains a clone for a gene in one organism, one can use low stringency hybridization of the DNA clones corresponding to a related organism to detect the homologous gene sequences of that organism. As mentioned before, the success of this approach depends on the similarity of the sequences of the homologous genes which in turn generally depends on the evolutionary relationship between the organisms.

Once identified, sequenced and cloned, the DNA of suitable plant species may then be modified or manipulated with any technique capable of decreasing the expression of a natural gene based on an isolated DNA clone corresponding, at least in part, to that gene. Suitable methods, at present, include antisense technologies (Bourque, 1995), co-suppression or gene silencing technologies (Meyer, 1995; Stam et al., 1997; Matzke and Matzke, 1995), and ribozyme technologies (Wegener et al. 1994; Barinaga, 1993).

These technologies are discussed in more detail below.

Down-regulation of Gene Expression

General

The activity of a particular enzyme, such as squalene epoxidase, is dependent on, among other things (such as the biochemical environment), the amount of enzyme (usually, and for the sake of this argument, a protein) that is present. The amount of enzyme present depends on the expression of the gene or genes encoding the enzyme of interest. Gene expression usually includes (not necessarily in this order) transcription of DNA to generate RNA, processing of the RNA produced from transcription, transport of RNA to the site of translation, translation of mature messenger RNA into polypeptide, proteolytic processing and folding of the nascent polypeptide, transport of the protein product to various cellular compartments, and post-translational modification of the protein (such as phosphorylation or glycosylation). Any effect or difference in any of the processes involved in gene expression can have an effect on the level of expression of an enzyme encoded by a given gene or genes. Gene expression often varies with cell type, tissue type and developmental stage. Likewise, enzyme levels in different cells and tissues and at different developmental stages vary widely. (For plant nuclear genes, this is often the result of differential transcription.)

Gene expression can also be affected by the breakdown of the gene product, the enzyme, or any of the intermediates in gene expression, such as precursor RNA.

From a genetic engineering point of view, in principle, gene expression can be down-regulated by affecting almost any of the processes involved. For example, although the mechanism is not well established, antisense technology (as discussed below) decreases the amount of translatable messenger RNA (mRNA) in an organism.

A) Antisense Technology

An appropriate antisense technology is disclosed, for example, in U.S. Pat. No. 5,190,931 issued on Mar. 2, 1993 to Masayori Inouye. The disclosure of this patent is incorporated herein by reference. In short, this technology can be used to regulate or inhibit gene expression in a cell by incorporating into the genetic material of the cell a nucleic acid sequence which is transcribed to produce an mRNA which is complementary to and capable of binding to the mRNA produced by the genetic material of the cell. The introduced nucleic acid sequences include equivalents of the gene to be regulated, or parts thereof, oriented in antisense fashion relative to a transcriptional promoter. Thus, the squalene epoxidase sequence, or part thereof, is introduced into the genetic material of the cell as a construct positioned between a transcriptional promoter segment and a transcriptional termination segment. The mRNA produced when the antisense sequences are transcribed binds or hybridizes to the mRNA from the squalene epoxidase gene of interest and prevents translation to a corresponding protein. Therefore, the protein coded for by the gene is not produced, or is produced in smaller quantities than would otherwise be the case. By introducing a gene that has a sequence that is antisense to the natural squalene epoxidase gene in oilseed plants, the epoxidation of squalene can be inhibited or reduced so that squalene accumulates in the plant tissues, especially the seeds, which can then be harvested in the usual way and the squalene extracted using conventional techniques.
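The relationship between a sense fragment, its antisense-oriented insert and the resulting transcripts can be sketched as follows. This is a minimal illustration, not the inventors' procedure: the 20-nucleotide fragment is invented for the example (it is not taken from the Sequence Listing), and the helper names are hypothetical.

```python
# Illustrative only: the fragment below is made up for this sketch.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return dna.translate(COMPLEMENT)[::-1]


def transcribe(dna_strand):
    """Return the RNA having the same sequence as the given DNA strand."""
    return dna_strand.upper().replace("T", "U")


def pairs_with(rna_a, rna_b):
    """True if rna_a is fully complementary to rna_b in antiparallel alignment."""
    comp = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return len(rna_a) == len(rna_b) and all(
        comp[x] == y for x, y in zip(rna_a, reversed(rna_b))
    )


sense_fragment = "ATGGCTTTCACGAACGTTTG"   # hypothetical cDNA fragment
# Placing the fragment in antisense orientation relative to the promoter is
# equivalent to transcribing its reverse complement.
antisense_insert = reverse_complement(sense_fragment)
sense_mrna = transcribe(sense_fragment)
antisense_rna = transcribe(antisense_insert)

if __name__ == "__main__":
    print(antisense_insert)
    print(pairs_with(antisense_rna, sense_mrna))
```

The sketch only checks base complementarity; whether hybridization actually occurs in the cell depends on the conditions discussed in the text.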

In terms of the process of antisense down-regulation of squalene epoxidase genes, for any plant species, it is generally necessary to use a gene from a closely related plant such that the genes are more than about 60%, and preferably more than about 70%, identical at the DNA level (Murphy, 1996). Thus, homologous (equivalent) genes from the same family of plants would reasonably be expected to give an antisense effect on any member species of that family. For example, Arabidopsis genes have been found to have antisense effects in B. napus (Murphy, 1996).

The antisense DNA in expressible form may be introduced into plant cells by any suitable transformation technique, e.g. in planta transformation (such as wound inoculation or vacuum infiltration). Transformation may also be carried out by co-cultivation of cotyledonary petioles and hypocotyl explants (e.g. of B. napus and B. carinata) with A. tumefaciens bearing suitable constructs (Moloney et al., 1989; DeBlock et al., 1989).

It would, of course, be optimal to identify a natural squalene epoxidase gene for each plant species to be modified in order to ensure complete correspondence of the DNA used to modify the natural gene and the DNA of the natural gene itself. If a gene from one plant species has been cloned, there are methods available to clone the same gene from other plants. The reliability of these methods (heterologous hybridization methods) depends on the similarity of the DNA sequence of the genes. If the DNA sequences have at least 60% of their sequence identical, and more preferably at least 70%, then the methods are usually reliable. Sequence similarity depends mostly on evolutionary (ancestral) relationships between plants. Practically, this means that either of the two genes first cloned by the inventors (the Arabidopsis and B. napus genes) may be used to clone the same gene in any other dicotyledonous plant (dicot), including, but not limited to soybean, tobacco, amaranth, potato, cotton, flax, bean, and pea. It is also reasonable to assume that the Arabidopsis or B. napus genes could also be used to clone the same genes from monocotyledonous plants (monocots), such as wheat, corn and barley.

The antisense effect occurs when hybridization can occur between antisense RNA and native RNA under the conditions prevailing in the cell. This may occur when the antisense RNA (and corresponding cDNA) contains as few as 20 nucleotides. More preferably, however, there should be at least 100 nucleotides in the cDNA to obtain the required effect reliably, and of course any larger portion up to the entire cDNA may be employed. In short, therefore, for effective antisense technology, the DNA sequence introduced into the plant genome should preferably be at least 20 consecutive nucleotides corresponding to the native squalene epoxidase gene, and more preferably between 100 nucleotides and the full DNA sequence of the gene. The homology of the added sequence to the native plant gene may be at least 60%, and more preferably at least 70%.
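The length and homology criteria above can be expressed as a simple screening check. This is a sketch under stated assumptions: the two sequences are taken to be already aligned and of equal length, the identity measure is a plain position-by-position count (not the Wilbur-Lipman similarity index used elsewhere in this specification), and the function names and category labels are hypothetical.

```python
MIN_LENGTH = 20            # minimum consecutive nucleotides stated in the text
PREFERRED_LENGTH = 100     # preferred minimum stated in the text
MIN_IDENTITY = 60.0        # percent, minimum stated in the text
PREFERRED_IDENTITY = 70.0  # percent, preferred minimum stated in the text


def percent_identity(aligned_a, aligned_b):
    """Percent of identical positions in two pre-aligned, equal-length sequences.

    Simple position count; NOT the Wilbur-Lipman similarity index.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(
        1 for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def classify_fragment(fragment, native_region):
    """Label a candidate fragment 'unsuitable', 'acceptable' or 'preferred'."""
    identity = percent_identity(fragment, native_region)
    length = len(fragment.replace("-", ""))  # ignore alignment gaps
    if length < MIN_LENGTH or identity < MIN_IDENTITY:
        return "unsuitable"
    if length >= PREFERRED_LENGTH and identity >= PREFERRED_IDENTITY:
        return "preferred"
    return "acceptable"
```

For example, a 100-nucleotide fragment that matches the native region at 65 of 100 positions would be labelled "acceptable" rather than "preferred" by this check.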

B) Ribozyme Technology

Another method for down-regulating gene expression by affecting mRNA levels is ribozyme technology. Ribozymes are RNA molecules capable of catalyzing the cleavage of RNA and other nucleic acids. In nature, Tetrahymena preribosomal RNA, some viroids, virusoids and satellite RNAs of plant viruses perform self-cleavage reactions. The cleavage site for some plant pathogenic RNAs consists of a consensus structure, called the "hammerhead" motif. The cleavage occurs within this hammerhead 3' to a GUX triplet, where X can be C, U, or A. The nucleotide region directing the catalysis of the cleavage reaction can be separated from the region where the cleavage occurs, and the recognition of the target RNA can be modified by changing the nucleotide sequence of the regions flanking the cleavage site. As a consequence, ribozymes can be designed to catalyze cleavage reactions on targeted sequences of separate RNA substrates. This provides a means of regulating gene expression, if the DNA sequence of the gene is known.
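Locating candidate GUX triplets in a target transcript can be sketched as follows. This is an illustrative scan only: the function name and the example sequence are hypothetical, and the sketch does not model the flanking recognition arms or RNA secondary structure, both of which would matter in an actual ribozyme design.

```python
def find_gux_sites(rna):
    """Return 0-based start positions of GUX triplets, where X is C, U or A.

    Cleavage is described as occurring 3' to the triplet, i.e. immediately
    after position start + 2. DNA input is accepted and read as RNA (T -> U).
    """
    rna = rna.upper().replace("T", "U")
    sites = []
    for i in range(len(rna) - 2):
        if rna[i] == "G" and rna[i + 1] == "U" and rna[i + 2] in "CUA":
            sites.append(i)
    return sites


if __name__ == "__main__":
    # Made-up target sequence, for illustration only.
    print(find_gux_sites("AAGUCGGGUGAGUA"))
```

Each reported position is only a candidate; a usable site also needs flanking sequence suitable for target recognition.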

In order to genetically engineer the down-regulation of a particular gene in plants, a vector can be constructed for transformation that includes one or more units, each of which may include a transcriptional promoter and a sequence encoding a ribozyme designed to cleave RNA transcribed from the gene or genes of interest. An example of this in plants has been provided by Schreier and co-workers (Steinecke et al. 1992, Wegener et al. 1994) in which a ribozyme was designed against neomycin phosphotransferase mRNA. Separate DNA constructs encoding the ribozyme and the neomycin phosphotransferase (npt) gene were used to transform plants. In plants containing both constructs, a reduction in neomycin phosphotransferase activity was observed relative to plants transformed with only the npt gene construct.

Ribozyme technology also appears to be successful in other eukaryotes, such as the fruit fly (Zhao and Pick, 1993).

C) Co-suppression or Homology-Dependent Gene Silencing

When attempts have been made to overexpress homologous genes in plants, often a small fraction of the resulting transgenic plants are found to have very low levels of expression of both the native gene and the introduced gene (transgene). This phenomenon has been called co-suppression or homology-dependent gene silencing (Stam et al. 1996, Matzke and Matzke 1995). The mechanism by which co-suppression occurs is very poorly understood. However, advantage can be taken of the phenomenon to down-regulate the expression of a gene of interest. This can be accomplished by transforming a plant with a DNA construct which contains a strong transcriptional promoter driving the sense transcription of a DNA sequence with high similarity to the gene of interest. For example, when the chalcone synthase gene was introduced into petunia in an attempt to overproduce chalcone synthase (which is involved in flower pigment biosynthesis), some transgenic plants showed pigment patterns and enzyme levels that indicated the suppression of chalcone synthase gene expression (Jorgensen 1990). Investigation of examples such as these has shown that the effect is often associated with repetition of the transgene inserts in the plant genome. Cosuppression may be dependent on the coding region of a gene or on the promoter and other non-coding regions.

Thus, the down-regulation of squalene epoxidase in plants may be engineered with the use of the cDNA sequences that are disclosed herein, or with plant genomic sequences which may include the promoter or promoters of squalene epoxidase genes.

D) Other Variations

Variations on the process of increasing squalene in plants include the use of different promoter sequences which may give rise to increased squalene in other tissues and at various stages of development. For example, the use of the cauliflower mosaic virus 35S promoter is likely to have an effect in most plant tissues. Other seed-specific and tissue-specific promoters may also be used.

Also, other plant transformation methods may be used such as the particle gun technique (Christou 1993).

As well, other vectors, selectable markers, transcription terminators, etc., may be used (Guerineau and Mullineaux 1993).

It has already been observed that overexpression of a fragment of the hamster 3-hydroxy-3-methylglutaryl CoA reductase (HMGR) gene in plants can elevate squalene levels in plants (Chappell et al. 1994). This is likely due to the fact that the level of HMGR limits the flow of carbon through the mevalonate/sterol pathway that includes squalene. It would be expected that a combination of elevated HMGR levels and down-regulated squalene epoxidase levels would have an effect on raising squalene levels that would be larger than the effect of either elevated HMGR alone or down-regulated squalene epoxidase alone.

EXAMPLES Identification of the Squalene Epoxidase Gene

The DNA sequence of the squalene epoxidase gene of yeast was published by Jandrositz et al. (1991). The predicted amino acid sequence of yeast squalene epoxidase was used with the TBLASTN™ computer search program (Altschul et al. 1990) to search a database called "the Non-Redundant database", which included partial cDNA sequences. This database is a non-redundant nucleotide database made up of:

pdb Brookhaven Protein Data Bank, April 1994 Release

genbank Genbank® Release 87.0, Feb. 15, 1995

gbupdate Genbank® cumulative updates to genbank major release

embl EMBL Data Library, Release 41.0, December 1994

emblu EMBL Data Library, cumulative updates to embl major release

The database is maintained by the National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health, Bethesda, Md. 20894, U.S.A.

The database included expressed sequence tags (ESTs), i.e. partial sequences of more-or-less randomly chosen cDNA clones. This search identified the Arabidopsis thaliana cDNA clone 129F12T7 (Genbank accession no. T44667) as a putative squalene epoxidase gene. This clone was the seventh highest scoring sequence in this search and the highest scoring plant sequence. The P(N) of 1.9×10⁻⁵ was considered borderline significant. The single high-scoring pair (HSP) of subsequences found was a stretch of 46 nucleotides with 21 positions identical (45%). Searches with the T44667 sequence revealed that a large portion of the 46 nucleotide region (29 nucleotides) matches a sequence motif found in a variety of enzymes that bind adenine dinucleotides, such as flavin adenine dinucleotide (FAD; which at least some squalene epoxidases are known to use as a cofactor; see Wierenga et al. 1986). So, in fact, the search, done when only the partial DNA sequence (T44667) was available, suggested the possibility, but did not confirm, that T44667 corresponded to a squalene epoxidase gene.

The 129F12T7 clone was obtained and its DNA sequenced completely by the inventors at the Plant Biotech Institute of the National Research Council of Canada at Saskatoon, Saskatchewan, Canada. The DNA sequence of the cDNA insert of p129F12T7 is shown in the Sequence Listing (see later) as SEQ ID NO:1. After the full sequence of the insert of p129F12T7 was obtained, the Non-Redundant Protein Database (NCBI) was searched using the BLAST™ software (Altschul et al. 1990) (NCBI) based on the predicted amino acid sequence. The amino acid sequence corresponding to the open reading frame of SEQ ID NO:1 is shown in the Sequence Listing as SEQ ID NO:2. The Arabidopsis sequence gave the highest scoring matches with squalene epoxidase sequences including that of rat (P(N)=5×10⁻⁶⁰) and yeast (P(N)=9.2×10⁻³³). No sequences which had been reliably identified had P(N) values less than 10⁻⁶. These numbers indicate that the product of the Arabidopsis gene is, in all probability, squalene epoxidase.

The 129F12T7 clone was used to probe a B. napus cDNA library, obtained from Dr. Edward Tsang of the Plant Biotech Institute. Two independent clones, pDR111 and pDR411, were isolated and sequenced. The Sequence Listing shows the DNA sequences of the cDNA inserts of pDR111 [SEQ ID NO:3] and pDR411 [SEQ ID NO:5] and the amino acid sequences corresponding to the coding regions of SEQ ID NO:3 [SEQ ID NO:4] and SEQ ID NO:5 [SEQ ID NO:11]. pDR111 and pDR411 have similar (but not identical) DNA sequences which are also similar to the 129F12T7 sequence. Plasmids p129F12T7, pDR111 and pDR411 were deposited at the American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110 USA, under the terms of the Budapest Treaty on Jan. 9, 1997 and were accepted. The deposit numbers are, respectively, ATCC 97847, ATCC 97846 and ATCC 97845. A single deposit receipt and statement of viability was issued for all three deposits on Jan. 17, 1997.

FIG. 1 of the accompanying drawings shows an alignment of amino acid sequences for the 129F12T7 clone [SEQ ID NO:2], the pDR111 clone [SEQ ID NO:4] and the pDR411 clone [SEQ ID NO:11], along with the squalene epoxidase amino acid sequences for mouse [SEQ ID NO:6], rat [SEQ ID NO:7] and yeast [SEQ ID NO:8]. The plant sequences show blocks of high similarity to the non-plant sequences, including the region thought to correspond to an adenine dinucleotide-binding site (residues 45-88 of the Arabidopsis sequence; Wierenga et al. 1986; Sakakibara et al. 1995), as well as in the C-terminal half of the sequence. The amino acid sequence similarities based on this alignment are shown in Table 1 below.

TABLE 1
______________________________________________________________________________________
Amino acid sequence similarities calculated by MEGALIGN™ software for the sequence
alignment of FIG. 1. All rows and columns refer to predicted amino acid sequences.

                            pDR411      p129F12T7   Mouse       Rat         Yeast
                                                    Squalene    Squalene    Squalene
                                                    Epoxidase   Epoxidase   Epoxidase
______________________________________________________________________________________
pDR111                      74.8        59.6        27.0        26.4        21.5
pDR411                                  62.9        29.2        27.8        21.3
p129F12T7                                           27.3        26.1        20.9
Mouse Squalene Epoxidase                                        91.8        30.4
Rat Squalene Epoxidase                                                      30.4
                              Sequence                                                                       ______________________________________                                    

Analysis of the pDR411 sequence suggests it has an intron in the 3'-end of its amino acid coding region which is, of course, unusual in cDNA. If nucleotides 1473-1629 (inclusive) are removed from the sequence and the cDNA translated, the C-terminus is more similar to the pDR111 and p129F12T7 amino acid sequences [SEQ ID NO:4 and SEQ ID NO:2]. Also, there are sequence patterns in this region that are common to other plant introns (5' and 3' splice consensus sequences and high AT content (Goodall and Filipowicz, 1991)). This may mean that the pDR411 clone represents an intermediate or precursor RNA, rather than the final messenger RNA (mRNA). There can therefore be less certainty in predicting the full amino acid sequence corresponding to pDR411, although this predicted sequence is shown in FIG. 1 [SEQ ID NO:11]. However, the possible presence of a small intron in the 3'-end of pDR411 does not cause a problem for its use in antisense techniques.
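The excision described above can be sketched in a few lines of Python. This is an illustration only: the function name `excise` and the toy sequence are hypothetical, and only the coordinates 1473-1629 come from the text. The removed span is 157 nucleotides long, which is not a multiple of three, so excising it would also be expected to shift the reading frame of everything downstream.

```python
def excise(seq: str, start: int, end: int) -> str:
    """Return seq with 1-based, inclusive positions start..end removed."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[:start - 1] + seq[end:]


# Coordinates of the putative intron, taken from the text.
INTRON_START, INTRON_END = 1473, 1629
intron_length = INTRON_END - INTRON_START + 1   # inclusive span
frame_shift = intron_length % 3                 # non-zero means a frame shift

# Toy demonstration on a made-up sequence (NOT the pDR411 sequence).
toy = "AAACCCGGGTTT"
spliced_toy = excise(toy, 4, 6)

print(intron_length, frame_shift, spliced_toy)
```

Applied to the real pDR411 cDNA, the same call would be `excise(pdr411_cdna, 1473, 1629)`, where `pdr411_cdna` is a placeholder for the sequence of SEQ ID NO:3 or whichever listing entry holds the clone.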

Employing the plant squalene epoxidase sequences, transgenic plants can be generated which accumulate squalene in their seeds. This can be done by established genetic transformation methods using DNA constructs that include the napin or other seed-specific promoters (Kridl, 1988; Anonymous, 1995) and fragments of plant squalene epoxidase genes arranged in the antisense orientation. Downregulation of the squalene epoxidase gene in seeds by antisense technology (Inouye, 1990; Bourque, 1995) will prevent the conversion of squalene to squalene epoxide and result in squalene accumulation.
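In an antisense construct the cDNA fragment sits in reverse orientation relative to the promoter, so the transcript is the reverse complement of the endogenous mRNA. A minimal sketch of that relationship follows. The helper name `reverse_complement` is hypothetical, and the example string is simply the first three codons of the coding region in SEQ ID NO:1 (ATG ACT TAC).

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


sense = "ATGACTTAC"                     # first three codons of the CDS
antisense = reverse_complement(sense)   # strand read in an antisense construct

print(sense, "->", antisense)
```

Applying the function twice returns the original strand, which is a quick sanity check when preparing construct maps.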

ISOLATION OF SQUALENE EPOXIDASE GENE IN B. NAPUS

The 129F12T7 clone obtained as described above was used to probe for the homologous gene in B. napus as follows.

Unless otherwise noted, all molecular biology methods were performed as described in Ausubel et al. (1994).

The Arabidopsis 129F12T7 DNA Probe

The plasmid p129F12T7 was digested with the restriction enzymes Sal I and Not I. The resulting DNA fragments were separated by agarose gel electrophoresis.

The 1.8 kb Sal I/Not I DNA fragment corresponding to the Arabidopsis squalene epoxidase cDNA was purified from a gel band. A radiolabelled DNA probe was prepared by the random priming method and [alpha-32P]-dCTP (deoxycytidine triphosphate).

Library Screening

The probe produced as above was used to screen a B. napus cDNA library, kindly provided by Dr. Edward Tsang of the Plant Biotechnology Institute (Saskatoon, Saskatchewan, Canada). To construct the library, B. napus seedlings (cv. Westar) were grown (on half strength Murashige and Skoog agar (1%) medium supplemented with 1% sucrose) in the dark at 22° C. for two weeks after germination and exposed to light for 24 hours. PolyA+RNA was extracted from the seedlings and first strand cDNA synthesis was primed with an oligo dT/Not I adapter/primer. Sal I adapters were ligated after second strand cDNA synthesis and a library was constructed in Not I/Sal I arms of the LambdaZipLox vector (Life Technologies).

The library was plated using standard methods and the Y1090 strain of E. coli. Approximately 25,000 plaques from the library were plated, lifted onto Hybond®-C nylon membranes (Amersham) and hybridized with the above probe according to the manufacturer's instructions. After two rounds of plaque purification, two independent clones, pDR111 and pDR411, were isolated by in vivo excision.

The p129F12T7, pDR111 and pDR411 clones were sequenced using the PRISM® DyeDeoxy Terminator Cycle Sequencing System (Perkin Elmer/Applied Biosystems) and a Model 373 DNA Sequencer (Applied Biosystems). DNA sequences were assembled and analyzed using the Lasergene® suite of software (DNASTAR, Inc.) and BLAST® and related software of the NCBI.

CONSTRUCTION OF VECTORS FOR PLANT TRANSFORMATION

FIGS. 2, 3 and 4 show three vectors constructed for plant transformation, namely pSE129A, pSE111A and pSE411A. In these drawings, the following abbreviations are used:

    ______________________________________
    nosT       3'-terminus of the nopaline synthase gene
    SE129      Sal I/Not I insert of p129F12T7
    SE111      Sal I/Xba I fragment of the insert of pDR111
    SE411      Sal I/Not I insert of pDR411
    Napin P    napin gene promoter (Josefsson 1986)
    ______________________________________

All other elements are described by Guerineau and Mullineaux (1993), Thomas et al. (1992) and Beban (1984).

These Plasmids Were Constructed as Follows

pDH1

The plasmid pE35SNT was obtained from Raju Datla (Plant Biotechnology Institute, Saskatoon, Saskatchewan Canada). It contains a double 35S promoter and nopaline synthase (Nos) terminator (Datla, 1992) in pUC19. It was digested with Hind III and Xba I to remove the double 35S promoter. The napin promoter (Josefsson et al. 1987) was isolated from pNap (obtained from Ravi Jain, Plant Biotechnology Institute, Saskatoon, Saskatchewan, Canada) by Hind III and Xba I digestion. The plasmid pDH1 was produced by ligation of the large pE35SNT/Hind III/Xba I fragment and the Hind III/Xba I napin promoter fragment. Thus, pDH1 contained the napin promoter and the Nos terminator between the Hind III and EcoR I sites of the pUC19 vector.

pSE129A

The p129F12T7 plasmid was digested with Pst I and Hind III. The fragment containing the Arabidopsis squalene epoxidase cDNA was ligated to the Pst I- and Hind III-digested vector pTrcHisB (INVITROGEN®) to give the circular plasmid pTrcHis129. pTrcHis129 was digested with Xba I and BamH I and the squalene epoxidase cDNA fragment was ligated into Xba I- and BamH I-digested pDH1. The resulting plasmid pDH129A contained the squalene epoxidase cDNA in antisense orientation downstream from the napin promoter and upstream of the Nos terminator. pDH129A was digested with Hind III and partially digested with EcoR I, and the fragment containing napin promoter, squalene epoxidase cDNA and Nos terminator was ligated into Hind III- and EcoR I-digested pRD400 (a binary vector for plant transformation containing a gene conferring kanamycin resistance; Datla et al. 1992) to give pSE129A.

pSE111A

The pDR111 plasmid was digested with Sma I and Xba I. The fragment containing a B. napus squalene epoxidase cDNA (excluding a small part of the 3' end downstream of the Xba I site) was ligated to the large fragment of Sma I- and Xba I-digested pDH129A vector (containing the napin promoter and Nos terminator) to give the circular plasmid pDH111A. pDH111A contained the squalene epoxidase cDNA in antisense orientation downstream from the napin promoter and upstream of the Nos terminator. pDH111A was digested with Hind III and partially with EcoR I and the fragment containing napin promoter, cDNA and Nos terminator was ligated into Hind III- and EcoR I-digested pRD400 to give pSE111A.

pSE411A

The pDR411 plasmid was digested with Sma I and Xba I. The fragment containing a B. napus squalene epoxidase cDNA was ligated to the large fragment of Sma I- and Xba I-digested pDH129A vector (containing the napin promoter and Nos terminator and excluding the Arabidopsis cDNA sequence) to give the circular plasmid pDH411A. pDH411A contained the squalene epoxidase cDNA in antisense orientation downstream from the napin promoter and upstream of the Nos terminator. pDH411A was digested with EcoR I and partially digested with Hind III and the fragment containing napin promoter, squalene epoxidase cDNA and Nos terminator was ligated into Hind III- and EcoR I-digested pRD400 (Datla et al. 1992) to give pSE411A.

The final vectors pSE129A, pSE111A and pSE411A were deposited on Mar. 5, 1997 under the terms of the Budapest Treaty at the American Type Culture Collection, 10801 University Boulevard, Manassas, Va. 20110, USA, under deposit nos. ATCC 97910, ATCC 97909 and ATCC 97908, respectively. These vectors were introduced into Agrobacterium tumefaciens strain GV3101 (bearing helper plasmid pMP90; Koncz and Schell, 1986) by electroporation.

PLANT GROWTH CONDITIONS

All A. thaliana control and transgenic plants were grown in controlled growth chambers, under continuous fluorescent illumination (150-200 μE m⁻² sec⁻¹) at 22° C., as described by Katavic et al. (1995).

PLANT TRANSFORMATION

The pSE129A construct was tested in A. thaliana by in planta transformation techniques.

Wild type (WT) A. thaliana plants of ecotype Columbia were grown in soil. In planta transformation was performed by vacuum infiltration (Bechtold et al. 1993) with an overnight bacterial suspension of A. tumefaciens strain GV3101 bearing the helper nopaline plasmid pMP90 (disarmed Ti plasmid with intact vir region acting in trans, gentamycin and kanamycin selection markers; Koncz and Schell, 1986) and the binary vector pSE129A.

After infiltration, plants were grown to set seeds (T₁ generation). Dry seeds (T₁ generation of seeds) were harvested in bulk and screened on selective medium with 50 mg/L kanamycin. After two to three weeks on selective medium, surviving seedlings were transferred to soil. Mature seeds from these seedlings (T₂ seeds) were used for squalene analysis. Mature seeds from untransformed wild type (WT) Columbia plants and pRD400 transgenic plants (binary vector pRD400, containing only kanamycin selection marker; Datla et al. 1992) were used as controls in analyses of seed lipids.

Seed Analysis

Seeds were analyzed for squalene levels as follows:

In all steps, care was taken to avoid contamination from external sources, particularly human skin. 5-10 mg of Arabidopsis seeds were weighed and rinsed with hexane to remove any external contamination. 1 ml of 7.5% KOH (in 95% methanol) was added to each sample and 250 ng of squalane were added as internal standard. (Squalane is the hydrogenated form of squalene.) Seeds were homogenized with a Polytron® (Model PRO200, PRO Scientific) at maximum speed for 40 seconds. The head of the Polytron was washed with 1 ml of 7.5% KOH (in 95% methanol) and the wash was pooled with the homogenate. The mixture was incubated at 80° C. for 1 hr, then cooled to room temperature. The mixture was centrifuged at 3000 g for 5 min, and the supernatant was transferred to a fresh tube. One ml of H₂O and 1.5 ml of hexane were added to the supernatant and, after vortexing, the mixture was centrifuged at 3000 g for 5 minutes. The hexane (top) layer was transferred to another test tube. The aqueous phase was re-extracted with 1.5 ml hexane and the hexane fractions were pooled. The hexane fraction was extracted with 1 ml of water/methanol/KOH (50:50:2) and evaporated under nitrogen. The residue was dissolved in 50 μl of hexane and transferred to an autosampler vial. Gas-liquid chromatography was performed with a DB5 column (J & W Scientific, USA) using the following parameters:

    ______________________________________
    Column Temperature:      0-1 min      180° C.
                             1-16 min     180-280° C. (linear ramp)
                             16-30 min    280° C.
    Injector Temperature:                 275° C.
    Detector Temperature:                 300° C.
    ______________________________________
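The oven program and the internal-standard arithmetic implied by the protocol can be sketched as follows. The text does not give the quantification formula, so this is an assumption-laden illustration: the function names are hypothetical, the peak areas are invented, and a detector response factor of 1.0 for squalene relative to squalane is assumed rather than taken from the source. Only the 250 ng of squalane, the 5-10 mg seed mass and the temperature program come from the text.

```python
def oven_temperature(t_min: float) -> float:
    """Column temperature (deg C) at t_min minutes into the 30 min run."""
    if not 0 <= t_min <= 30:
        raise ValueError("run lasts from 0 to 30 minutes")
    if t_min <= 1:
        return 180.0                      # initial 1 min hold
    if t_min <= 16:
        # linear ramp: 100 deg C over 15 min
        return 180.0 + (t_min - 1) * (280.0 - 180.0) / 15.0
    return 280.0                          # final hold


def squalene_ug_per_g(area_squalene: float, area_squalane: float,
                      seed_mg: float, standard_ng: float = 250.0,
                      response_factor: float = 1.0) -> float:
    """Squalene content from the squalene/squalane peak-area ratio.

    response_factor = 1.0 is an ASSUMPTION, not a value from the text.
    """
    squalene_ng = standard_ng * (area_squalene / area_squalane) / response_factor
    # ng per mg of seed is numerically the same as ug per g
    return (squalene_ng / 1000.0) / (seed_mg / 1000.0)


# Invented peak areas for a hypothetical 10 mg seed sample.
example = squalene_ug_per_g(area_squalene=2000.0, area_squalane=10000.0,
                            seed_mg=10.0)
print(oven_temperature(8.5), round(example, 2))
```

In practice the response factor would be checked against a calibration run before results are reported.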

Transgenic Results

Seeds from 9 Arabidopsis lines transformed with pRD400 and 55 lines transformed with pSE129A were analyzed for squalene content. Table 2 below shows the results for all of the pRD400 transgenic lines and 4 pSE129A lines.

                  TABLE 2
    ______________________________________________________
                       Squalene             Standard Deviation
    Line   Vector      (μg/g dry weight)    of 3 Assays
    ______________________________________________________
    k401   pRD400      4.04                 0.5
    k402   pRD400      4.71                 0.16
    k403   pRD400      4.39                 0.34
    k404   pRD400      4.86                 0.75
    k405   pRD400      3.92                 0.92
    k406   pRD400      4.04                 1.68
    k409   pRD400      5.03                 0.85
    k410   pRD400      6.09                 1.22
    k411   pRD400      4.57                 1.26
    k9     pSE129A     9.96                 1.59
    k12    pSE129A     11.34                2.01
    k50    pSE129A     12.38                0.35
    k54    pSE129A     9.76                 1.43
    ______________________________________________________

The mean and standard deviation of the 9 pRD400 lines are 4.6 and 0.7 μg/g dry weight, respectively.
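The summary figures can be checked directly from the Table 2 values. The sketch below uses Python's standard `statistics` module; the variable names are illustrative. The reported 0.7 is reproduced with the sample (n - 1) standard deviation, which is assumed here to be the formula used. The fold increase of each pSE129A line over the control mean is a derived quantity, not a figure stated in the text.

```python
from statistics import mean, stdev

# Squalene content (ug/g dry weight) copied from Table 2.
control = {"k401": 4.04, "k402": 4.71, "k403": 4.39, "k404": 4.86,
           "k405": 3.92, "k406": 4.04, "k409": 5.03, "k410": 6.09,
           "k411": 4.57}
antisense = {"k9": 9.96, "k12": 11.34, "k50": 12.38, "k54": 9.76}

control_mean = mean(list(control.values()))
control_sd = stdev(list(control.values()))   # sample SD, n - 1 denominator

# Fold increase of each antisense line relative to the control mean.
fold = {line: value / control_mean for line, value in antisense.items()}

print(round(control_mean, 1), round(control_sd, 1))
for line, ratio in fold.items():
    print(line, round(ratio, 2))
```

On these figures each of the four listed pSE129A lines carries more than twice the mean squalene content of the vector-only controls.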

__________________________________________________________________________
                             SEQUENCE LISTING

(1) GENERAL INFORMATION:

   (iii) NUMBER OF SEQUENCES: 11

(2) INFORMATION FOR SEQ ID NO: 1:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1756 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Arabidopsis thaliana
          (B) STRAIN: Columbia
          (D) DEVELOPMENTAL STAGE: 3 different stages
          (F) TISSUE TYPE: 4 different tissues

   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: Lambda-PRL2
          (B) CLONE: 129F12T7

    (ix) FEATURE:
          (A) NAME/KEY: CDS
          (B) LOCATION: 15..1565
          (D) OTHER INFORMATION: /codon_start= 15
               /function= "converts squalene to 2,3-oxidosqualene"
               /EC_number= 1.14.99.7
               /product= "squalene epoxidase"
               /standard_name= "squalene monooxygenase (2,3-epoxidizing)"

    (ix) FEATURE:
          (A) NAME/KEY: 3'UTR
          (B) LOCATION: 1566..1756

    (ix) FEATURE:
          (A) NAME/KEY: polyA_site
          (B) LOCATION: 1756

    (ix) FEATURE:
          (A) NAME/KEY: 5'UTR
          (B) LOCATION: 1..14

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCACGCGTCC GGCA ATG ACT TAC GCG TGG TTA TGG ACG CTT CTC GCC TTT    50
                Met Thr Tyr Ala Trp Leu Trp Thr Leu Leu Ala Phe
                1               5                   10

GTT CTG ACA TGG ATG GTT TTT CAC CTC ATC AAG ATG AAG AAG GCG GCA    98
Val Leu Thr Trp Met Val Phe His Leu Ile Lys Met Lys Lys Ala Ala
        15                  20                  25

ACC GGA GAT TTA GAG GCC GAG GCA GAA GCA AGA AGA GAT GGT GCA ACG   146
Thr Gly Asp Leu Glu Ala Glu Ala Glu Ala Arg Arg Asp Gly Ala Thr
    30                  35                  40

GAT GTC ATC ATT GTT GGG GCG GGT GTT GCA GGC GCT TCT CTT GCT TAT   194
Asp Val Ile Ile Val Gly Ala Gly Val Ala Gly Ala Ser Leu Ala Tyr
45                  50                  55                  60

GCT TTA GCT AAG GAT GGA CGA CGA GTA CAT GTG ATA GAG AGG GAC TTA   242
Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Leu
                65                  70                  75

AAA GAG CCA CAA AGA TTC ATG GGA GAG CTG ATG CAA GCG GGA GGT CGC   290
Lys Glu Pro Gln Arg Phe Met Gly Glu Leu Met Gln Ala Gly Gly Arg
            80                  85                  90

TTC ATG TTA GCC CAG CTT GGC CTC GAA GAT TGT TTG GAG GAC ATA GAC   338
Phe Met Leu Ala Gln Leu Gly Leu Glu Asp Cys Leu Glu Asp Ile Asp
        95                  100                 105

GCA CAA GAA GCG AAG TCC TTG GCA ATA TAC AAG GAT GGA AAA CAC GCG   386
Ala Gln Glu Ala Lys Ser Leu Ala Ile Tyr Lys Asp Gly Lys His Ala
    110                 115                 120

ACA TTG CCT TTT CCA GAT GAC AAG AGT TTT CCT CAT GAG CCA GTA GGT   434
Thr Leu Pro Phe Pro Asp Asp Lys Ser Phe Pro His Glu Pro Val Gly
125                 130                 135                 140

AGA CTC TTA CGT AAT GGT CGG CTG GTA CAA CGT TTA CGC CAA AAA GCA   482
Arg Leu Leu Arg Asn Gly Arg Leu Val Gln Arg Leu Arg Gln Lys Ala
                145                 150                 155

GCT TCT CTT AGC AAT GTT CAA TTA GAA GAA GGA ACA GTG AAG TCT TTA   530
Ala Ser Leu Ser Asn Val Gln Leu Glu Glu Gly Thr Val Lys Ser Leu
            160                 165                 170

ATT GAA GAA GAA GGA GTG GTC AAA GGA GTG ACA TAC AAA AAT AGC GCA   578
Ile Glu Glu Glu Gly Val Val Lys Gly Val Thr Tyr Lys Asn Ser Ala
        175                 180                 185

GGC GAA GAA ATA ACG GCC TTT GCA CCT CTT ACT GTC GTA TGC GAT GGT   626
Gly Glu Glu Ile Thr Ala Phe Ala Pro Leu Thr Val Val Cys Asp Gly
    190                 195                 200

TGT TAT TCG AAC CTT CGT CGG TCA CTC GTG GAT AAT ACT GAG GAA GTC   674
Cys Tyr Ser Asn Leu Arg Arg Ser Leu Val Asp Asn Thr Glu Glu Val
205                 210                 215                 220

CTC TCG TAC ATG GTG GGT TAC GTC ACG AAG AAT AGC CGA CTT GAA GAT   722
Leu Ser Tyr Met Val Gly Tyr Val Thr Lys Asn Ser Arg Leu Glu Asp
                225                 230                 235

CCC CAT AGT CTA CAT TTG ATA TTT TCT AAA CCT TTG GTT TGT GTT ATA   770
Pro His Ser Leu His Leu Ile Phe Ser Lys Pro Leu Val Cys Val Ile
            240                 245                 250

TAT CAA ATA ACC AGT GAT GAA GTT CGT TGT GTT GCC GAA GTT CCC GCT   818
Tyr Gln Ile Thr Ser Asp Glu Val Arg Cys Val Ala Glu Val Pro Ala
        255                 260                 265

GAT AGT ATT CCT TCT ATA TCG AAT GGT GAA ATG TCT ACC TTC CTC AAG   866
Asp Ser Ile Pro Ser Ile Ser Asn Gly Glu Met Ser Thr Phe Leu Lys
    270                 275                 280

AAA TCA ATG GCT CCT CAG ATA CCT GAA ACT GGA AAT CTT CGG GAG ATA   914
Lys Ser Met Ala Pro Gln Ile Pro Glu Thr Gly Asn Leu Arg Glu Ile
285                 290                 295                 300

TTT TTG AAA GGC ATA GAG GAA GGA TTA CCA GAG ATA AAA TCA ACA GCG   962
Phe Leu Lys Gly Ile Glu Glu Gly Leu Pro Glu Ile Lys Ser Thr Ala
                305                 310                 315

ACG AAA AGT ATG TCA TCG AGA TTG TGT GAT AAA AGA GGA GTG ATT GTG  1010
Thr Lys Ser Met Ser Ser Arg Leu Cys Asp Lys Arg Gly Val Ile Val
            320                 325                 330

TTG GGA GAT GCA TTC AAT ATG CGT CAT CCT ATA ATC GCG TCA GGA ATG  1058
Leu Gly Asp Ala Phe Asn Met Arg His Pro Ile Ile Ala Ser Gly Met
        335                 340                 345

ATG GTT GCA CTC TCG GAC ATT TGC ATT CTA CGC AAT CTT CTC AAA CCA  1106
Met Val Ala Leu Ser Asp Ile Cys Ile Leu Arg Asn Leu Leu Lys Pro
    350                 355                 360

TTG CCT AAC CTC AGC AAT ACT AAG AAA GTC TCT GAT CTT GTC AAG TCC  1154
Leu Pro Asn Leu Ser Asn Thr Lys Lys Val Ser Asp Leu Val Lys Ser
365                 370                 375                 380

TTT TAC ATC ATC CGC AAG CCA ATG TCA GCG ACC GTG AAC ACG CTC GCG  1202
Phe Tyr Ile Ile Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Ala
                385                 390                 395

AGT ATC TTT TCA CAA GTG CTT GTT GCT ACA ACA GAC GAA GCA AGA GAG  1250
Ser Ile Phe Ser Gln Val Leu Val Ala Thr Thr Asp Glu Ala Arg Glu
            400                 405                 410

GGA ATG CGA CAA GGC TGC TTC AAT TAC CTA GCT CGT GGA GAT TTT AAA  1298
Gly Met Arg Gln Gly Cys Phe Asn Tyr Leu Ala Arg Gly Asp Phe Lys
        415                 420                 425

ACA AGG GGA TTG ATG ACT ATT CTC GGA GGC ATG AAC CCT CAC CCT CTT  1346
Thr Arg Gly Leu Met Thr Ile Leu Gly Gly Met Asn Pro His Pro Leu
    430                 435                 440

ACT CTA GTC CTT CAT CTT GTA GCC ATC ACC CTT ACG TCC ATG GGC CAC  1394
Thr Leu Val Leu His Leu Val Ala Ile Thr Leu Thr Ser Met Gly His
445                 450                 455                 460

TTG CTC TCT CCG TTT CCT TCG CCT CGT CGC TTT TGG CAT AGC CTC AGA  1442
Leu Leu Ser Pro Phe Pro Ser Pro Arg Arg Phe Trp His Ser Leu Arg
                465                 470                 475

ATT CTT GCC TGG GCT TTG CAA ATG TTG GGT GCA CAT TTA GTG GAT GAA  1490
Ile Leu Ala Trp Ala Leu Gln Met Leu Gly Ala His Leu Val Asp Glu
            480                 485                 490

GGA TTC AAG GAA ATG TTG ATT CCA ACA AAC GCA GCT GCT TAT CGA AGG  1538
Gly Phe Lys Glu Met Leu Ile Pro Thr Asn Ala Ala Ala Tyr Arg Arg
        495                 500                 505

AAC TAT ATC GCC ACA ACC ACT GTT TGA TCAATCCATA ACACGAAGAC        1585
Asn Tyr Ile Ala Thr Thr Thr Val
    510                 515

TGTTTTATTC GGAGATGAAA AATAACAACT CAAACAGTTA ACTTTCTACA ACCAAATAAA  1645

TAATTGTGTG TATATGAAGT TGAGCCTATG GTTAAGCTCT ACTGAATTGT GTTGAAAACA  1705

.......... ...TATATGC TAATTTGTTA TATTCTATTT ATTGATTCTT G           1756

(2) INFORMATION FOR SEQ ID NO: 2:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 516 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Tyr Ala Trp Leu Trp Thr Leu Leu Ala Phe Val Leu Thr Trp
1               5                   10                  15

Met Val Phe His Leu Ile Lys Met Lys Lys Ala Ala Thr Gly Asp Leu
            20                  25                  30

Glu Ala Glu Ala Glu Ala Arg Arg Asp Gly Ala Thr Asp Val Ile Ile
        35                  40                  45

Val Gly Ala Gly Val Ala Gly Ala Ser Leu Ala Tyr Ala Leu Ala Lys
    50                  55                  60

Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Leu Lys Glu Pro Gln
65                  70                  75                  80

Arg Phe Met Gly Glu Leu Met Gln Ala Gly Gly Arg Phe Met Leu Ala
                85                  90                  95

Gln Leu Gly Leu Glu Asp Cys Leu Glu Asp Ile Asp Ala Gln Glu Ala
            100                 105                 110

Lys Ser Leu Ala Ile Tyr Lys Asp Gly Lys His Ala Thr Leu Pro Phe
        115                 120                 125

Pro Asp Asp Lys Ser Phe Pro His Glu Pro Val Gly Arg Leu Leu Arg
    130                 135                 140

Asn Gly Arg Leu Val Gln Arg Leu Arg Gln Lys Ala Ala Ser Leu Ser
145                 150                 155                 160

Asn Val Gln Leu Glu Glu Gly Thr Val Lys Ser Leu Ile Glu Glu Glu
                165                 170                 175

Gly Val Val Lys Gly Val Thr Tyr Lys Asn Ser Ala Gly Glu Glu Ile
            180                 185                 190

Thr Ala Phe Ala Pro Leu Thr Val Val Cys Asp Gly Cys Tyr Ser Asn
        195                 200                 205

Leu Arg Arg Ser Leu Val Asp Asn Thr Glu Glu Val Leu Ser Tyr Met
    210                 215                 220

Val Gly Tyr Val Thr Lys Asn Ser Arg Leu Glu Asp Pro His Ser Leu
225                 230                 235                 240

His Leu Ile Phe Ser Lys Pro Leu Val Cys Val Ile Tyr Gln Ile Thr
                245                 250                 255

Ser Asp Glu Val Arg Cys Val Ala Glu Val Pro Ala Asp Ser Ile Pro
            260                 265                 270

Ser Ile Ser Asn Gly Glu Met Ser Thr Phe Leu Lys Lys Ser Met Ala
        275                 280                 285

Pro Gln Ile Pro Glu Thr Gly Asn Leu Arg Glu Ile Phe Leu Lys Gly
    290                 295                 300

Ile Glu Glu Gly Leu Pro Glu Ile Lys Ser Thr Ala Thr Lys Ser Met
305                 310                 315                 320

Ser Ser Arg Leu Cys Asp Lys Arg Gly Val Ile Val Leu Gly Asp Ala
                325                 330                 335

Phe Asn Met Arg His Pro Ile Ile Ala Ser Gly Met Met Val Ala Leu
            340                 345                 350

Ser Asp Ile Cys Ile Leu Arg Asn Leu Leu Lys Pro Leu Pro Asn Leu
        355                 360                 365

Ser Asn Thr Lys Lys Val Ser Asp Leu Val Lys Ser Phe Tyr Ile Ile
    370                 375                 380

Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Ala Ser Ile Phe Ser
385                 390                 395                 400

Gln Val Leu Val Ala Thr Thr Asp Glu Ala Arg Glu Gly Met Arg Gln
                405                 410                 415

Gly Cys Phe Asn Tyr Leu Ala Arg Gly Asp Phe Lys Thr Arg Gly Leu
            420                 425                 430

Met Thr Ile Leu Gly Gly Met Asn Pro His Pro Leu Thr Leu Val Leu
        435                 440                 445

His Leu Val Ala Ile Thr Leu Thr Ser Met Gly His Leu Leu Ser Pro
    450                 455                 460

Phe Pro Ser Pro Arg Arg Phe Trp His Ser Leu Arg Ile Leu Ala Trp
   465                 4 - #70                 4 - #75                 4 -        #80                                                                            - Ala Leu Gln Met Leu Gly Ala His Leu Val As - #p Glu Gly Phe Lys Glu          #               495                                                            - Met Leu Ile Pro Thr Asn Ala Ala Ala Tyr Ar - #g Arg Asn Tyr Ile Ala          #           510                                                                - Thr Thr Thr Val                                                                      515                                                                    - (2) INFORMATION FOR SEQ ID NO: 3:                                            -      (i) SEQUENCE CHARACTERISTICS:                                           #pairs    (A) LENGTH: 1748 base                                                          (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: double                                                       (D) TOPOLOGY: linear                                                 -     (ii) MOLECULE TYPE: cDNA to mRNA                                         -    (iii) HYPOTHETICAL: NO                                                    -     (iv) ANTI-SENSE: NO                                                      -     (vi) ORIGINAL SOURCE:                                                    #napus    (A) ORGANISM: Brassica                                                         (B) STRAIN: Westar                                                   #14 day greening-etiolatedL STAGE:                                                       (F) TISSUE TYPE: hypoco - #tyls                                      -    (vii) IMMEDIATE SOURCE:                                                             (A) LIBRARY: Tsang                                                             (B) CLONE: pDR111                                                    -     (ix) FEATURE:   
                                                                   (A) NAME/KEY: 5'UTR                                                            (B) LOCATION:1..18                                                   -     (ix) FEATURE:                                                                      (A) NAME/KEY: CDS                                                              (B) LOCATION:19..1575                                                -     (ix) FEATURE:                                                                      (A) NAME/KEY: 3'UTR                                                            (B) LOCATION:1576..1748                                              #3:   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:                                    #GTT TGT TTA CGG       51 GAT ATG GCT TTT GTG GAA                              #Leu Argsp Met Ala Phe Val Glu Val Cys                                         #          525                                                                 - ATG CTA CTT GTC TTC GTA CTG TCT TGG ACG AT - #A TTT CAC GTC AAC AAC            99                                                                           Met Leu Leu Val Phe Val Leu Ser Trp Thr Il - #e Phe His Val Asn Asn            #   540                                                                        - AGG AAG AAG AAG AAG GCG ACG AAG TTG GCG GA - #T CTG GCT ACT GAG GAG           147                                                                           Arg Lys Lys Lys Lys Ala Thr Lys Leu Ala As - #p Leu Ala Thr Glu Glu            545                 5 - #50                 5 - #55                 5 -        #60                                                                            - AGA AAA GAA GGT GGC CCT GAC GTC ATA ATA GT - #C GGA GCT GGA GTG GGC           195                                                                           Arg Lys Glu Gly Gly Pro Asp Val Ile Ile Va - #l Gly Ala Gly Val Gly            #               575                            
                                - GGC TCA GCT CTC GCC TAT GCT CTT GCT AAG GA - #C GGG CGT CGA GTA CAT           243                                                                           Gly Ser Ala Leu Ala Tyr Ala Leu Ala Lys As - #p Gly Arg Arg Val His            #           590                                                                - GTG ATA GAA AGA GAC ATG AGA GAG CCA GTG AG - #A ATG ATG GGT GAG TTC           291                                                                           Val Ile Glu Arg Asp Met Arg Glu Pro Val Ar - #g Met Met Gly Glu Phe            #       605                                                                    - ATG CAG CCA GGA GGA CGG CTC ATG CTT TCT AA - #G CTC GGT CTT CAA GAT           339                                                                           Met Gln Pro Gly Gly Arg Leu Met Leu Ser Ly - #s Leu Gly Leu Gln Asp            #   620                                                                        - TGT TTA GAG GAA ATA GAC GCA CAG AAA TCC AC - #C GGC ATA AGA CTT TTT           387                                                                           Cys Leu Glu Glu Ile Asp Ala Gln Lys Ser Th - #r Gly Ile Arg Leu Phe            625                 6 - #30                 6 - #35                 6 -        #40                                                                            - AAG GAC GGA AAA GAA ACT GTC GCA TGT TTT CC - #G GTG GAC ACC AAC TTT           435                                                                           Lys Asp Gly Lys Glu Thr Val Ala Cys Phe Pr - #o Val Asp Thr Asn Phe            #               655                                                            - CCT TAT GAA CCA TCT GGT CGA TTT TTT CAC AA - #T GGC CGT TTT GTC CAG           483                                                                           Pro Tyr Glu Pro Ser Gly Arg Phe Phe His As - #n Gly Arg Phe Val Gln            #           670                                                         
       - AGA CTG CGC CAA AAG GCC TCT TCT CTT CCC AA - #T GTG CGG CTG GAA GAA           531                                                                           Arg Leu Arg Gln Lys Ala Ser Ser Leu Pro As - #n Val Arg Leu Glu Glu            #       685                                                                    - GGG ACC GTC CGA TCT TTG ATA GAA GAA AAA GG - #A GTG GTC AAA GGA GTG           579                                                                           Gly Thr Val Arg Ser Leu Ile Glu Glu Lys Gl - #y Val Val Lys Gly Val            #   700                                                                        - ACA TAC AAG AAC AGT TCA GGG GAA GAA ACC AC - #A TCA TTT GCA CCT CTC           627                                                                           Thr Tyr Lys Asn Ser Ser Gly Glu Glu Thr Th - #r Ser Phe Ala Pro Leu            705                 7 - #10                 7 - #15                 7 -        #20                                                                            - ACT GTC GTA TGC GAT GGT TGC CAC TCG AAC CT - #T CGT CGC TCT CTA AAT           675                                                                           Thr Val Val Cys Asp Gly Cys His Ser Asn Le - #u Arg Arg Ser Leu Asn            #               735                                                            - GAC AAC AAT GCG GAG GTT ACG GCG TAC GAG AT - #T GGT TAC ATC TCG AGG           723                                                                           Asp Asn Asn Ala Glu Val Thr Ala Tyr Glu Il - #e Gly Tyr Ile Ser Arg            #           750                                                                - AAT TGT CGC CTT GAA CAG CCC GAC AAG TTA CA - #C TTG ATA ATG GCT AAA           771                                                                           Asn Cys Arg Leu Glu Gln Pro Asp Lys Leu Hi - #s Leu Ile Met Ala Lys            #       765                                                                    - CCG TCT TTC GCC 
ATG TTG TAT CAA GTC AGC AG - #C ACC GAC GTT CGT TGT           819                                                                           Pro Ser Phe Ala Met Leu Tyr Gln Val Ser Se - #r Thr Asp Val Arg Cys            #   780                                                                        - AAT TTT GAG CTT CTC TCC AAA AAT CTT CCT TC - #T GTT TCA AAT GGT GAA           867                                                                           Asn Phe Glu Leu Leu Ser Lys Asn Leu Pro Se - #r Val Ser Asn Gly Glu            785                 7 - #90                 7 - #95                 8 -        #00                                                                            - ATG ACG TCC TTC GTG AGG AAC TCT ATT GCT CC - #C CAG GTA CCT CTA AAA           915                                                                           Met Thr Ser Phe Val Arg Asn Ser Ile Ala Pr - #o Gln Val Pro Leu Lys            #               815                                                            - CTC CGC AAA ACA TTT TTG AAA GGG CTC GAT GA - #G GGA TCA CAT ATA AAA           963                                                                           Leu Arg Lys Thr Phe Leu Lys Gly Leu Asp Gl - #u Gly Ser His Ile Lys            #           830                                                                - ATT ACA CAA GCA AAG CGC ATC CCA GCT ACT TT - #G AGC AGA AAA AAG GGA          1011                                                                           Ile Thr Gln Ala Lys Arg Ile Pro Ala Thr Le - #u Ser Arg Lys Lys Gly            #       845                                                                    - GTG ATT GTG TTG GGA GAT GCA TTC AAC ATG CG - #T CAT CCC GTA ATC GCG          1059                                                                           Val Ile Val Leu Gly Asp Ala Phe Asn Met Ar - #g His Pro Val Ile Ala            #   860                                                                        - TCG GGG ATG ATG GTT TTA TTG TCT GAC ATT 
CT - #C ATT CTA AGC CGT CTT          1107                                                                           Ser Gly Met Met Val Leu Leu Ser Asp Ile Le - #u Ile Leu Ser Arg Leu            865                 8 - #70                 8 - #75                 8 -        #80                                                                            - CTC AAG CCT TTG GGC AAC CTC GGT GAT GAA AA - #C AAA GTC TCA GAA GTT          1155                                                                           Leu Lys Pro Leu Gly Asn Leu Gly Asp Glu As - #n Lys Val Ser Glu Val            #               895                                                            - ATG AAG TCC TTC TAT GCT CTA CGC AAG CCA AT - #G TCA GCA ACA GTA AAC          1203                                                                           Met Lys Ser Phe Tyr Ala Leu Arg Lys Pro Me - #t Ser Ala Thr Val Asn            #           910                                                                - ACA CTA GGG AAT TCA TTT TGG CAA GTG CTA AT - #T GCT TCA ACG GAC GAA          1251                                                                           Thr Leu Gly Asn Ser Phe Trp Gln Val Leu Il - #e Ala Ser Thr Asp Glu            #       925                                                                    - GCA AAA GAG GCC ATG CGA CAA GGT TGC TTT GA - #T TAC CTC TCT AGT GGT          1299                                                                           Ala Lys Glu Ala Met Arg Gln Gly Cys Phe As - #p Tyr Leu Ser Ser Gly            #   940                                                                        - GGG TTT CGC ACG TCA GGC TTG ATG GCT CTG AT - #T GGT GGC ATG AAC CCT          1347                                                                           Gly Phe Arg Thr Ser Gly Leu Met Ala Leu Il - #e Gly Gly Met Asn Pro            945                 9 - #50                 9 - #55                 9 -        #60                                                                
            - AGG CCA CTT TCT CTC TTC TAT CAT CTA TTC GT - #T ATT TCT TTA TCC TCC          1395                                                                           Arg Pro Leu Ser Leu Phe Tyr His Leu Phe Va - #l Ile Ser Leu Ser Ser            #               975                                                            - ATT GGC CAA CTG CTC TCT CCA TTC CCC ACT CC - #T CTT CGT GTT TGG CAT          1443                                                                           Ile Gly Gln Leu Leu Ser Pro Phe Pro Thr Pr - #o Leu Arg Val Trp His            #           990                                                                - AGC CTC AGA CTT CTT GAT TTG TCT TTG AAA AT - #G TTG GTT CCT CAT CTC          1491                                                                           Ser Leu Arg Leu Leu Asp Leu Ser Leu Lys Me - #t Leu Val Pro His Leu            #      10050                                                                   - AAG GCC GAA GGA ATA GGT CAA ATG TTG TCT CC - #A ACA AAT GCA GCG GCG          1539                                                                           Lys Ala Glu Gly Ile Gly Gln Met Leu Ser Pr - #o Thr Asn Ala Ala Ala            #  10205                                                                       - TAT CGC AAA AGC TAT ATG GCT GCA ACC GTT GT - #C TAG ACATTGATGA               1585                                                                           Tyr Arg Lys Ser Tyr Met Ala Ala Thr Val Va - #l                                1025                1030 - #                1035                               - AATATAGATG GTGCACAAAT CTTTGTGATT GTGGATTTGT GAAAATAGTA TT - #GCAATATG        1645                                                                           - TTACTGAAGA AACTTTTCCT TATCCACTTA TAAGTGGAAA TAGGAAGAAT GT - #GTATATAT        1705                                                                           #                 174 - #8AAATAAAA TTAAGAAAAT AAC                              - (2) 
INFORMATION FOR SEQ ID NO: 4:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 518 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Asp Met Ala Phe Val Glu Val Cys Leu Arg Met Leu Leu Val Phe
                                                         15
Val Leu Ser Trp Thr Ile Phe His Val Asn Asn Arg Lys Lys Lys Lys
                                                     30
Ala Thr Lys Leu Ala Asp Leu Ala Thr Glu Glu Arg Lys Glu Gly Gly
                                                 45
Pro Asp Val Ile Ile Val Gly Ala Gly Val Gly Gly Ser Ala Leu Ala
                                             60
Tyr Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp
                                                             80
Met Arg Glu Pro Val Arg Met Met Gly Glu Phe Met Gln Pro Gly Gly
                                                         95
Arg Leu Met Leu Ser Lys Leu Gly Leu Gln Asp Cys Leu Glu Glu Ile
                                                    110
Asp Ala Gln Lys Ser Thr Gly Ile Arg Leu Phe Lys Asp Gly Lys Glu
                                                125
Thr Val Ala Cys Phe Pro Val Asp Thr Asn Phe Pro Tyr Glu Pro Ser
                                            140
Gly Arg Phe Phe His Asn Gly Arg Phe Val Gln Arg Leu Arg Gln Lys
145                 150                 155                 160
Ala Ser Ser Leu Pro Asn Val Arg Leu Glu Glu Gly Thr Val Arg Ser
                                                        175
Leu Ile Glu Glu Lys Gly Val Val Lys Gly Val Thr Tyr Lys Asn Ser
                                                    190
Ser Gly Glu Glu Thr Thr Ser Phe Ala Pro Leu Thr Val Val Cys Asp
                                                205
Gly Cys His Ser Asn Leu Arg Arg Ser Leu Asn Asp Asn Asn Ala Glu
                                            220
Val Thr Ala Tyr Glu Ile Gly Tyr Ile Ser Arg Asn Cys Arg Leu Glu
225                 230                 235                 240
Gln Pro Asp Lys Leu His Leu Ile Met Ala Lys Pro Ser Phe Ala Met
                                                        255
Leu Tyr Gln Val Ser Ser Thr Asp Val Arg Cys Asn Phe Glu Leu Leu
                                                    270
Ser Lys Asn Leu Pro Ser Val Ser Asn Gly Glu Met Thr Ser Phe Val
                                                285
Arg Asn Ser Ile Ala Pro Gln Val Pro Leu Lys Leu Arg Lys Thr Phe
                                            300
Leu Lys Gly Leu Asp Glu Gly Ser His Ile Lys Ile Thr Gln Ala Lys
305                 310                 315                 320
Arg Ile Pro Ala Thr Leu Ser Arg Lys Lys Gly Val Ile Val Leu Gly
                                                        335
Asp Ala Phe Asn Met Arg His Pro Val Ile Ala Ser Gly Met Met Val
                                                    350
Leu Leu Ser Asp Ile Leu Ile Leu Ser Arg Leu Leu Lys Pro Leu Gly
                                                365
Asn Leu Gly Asp Glu Asn Lys Val Ser Glu Val Met Lys Ser Phe Tyr
                                            380
Ala Leu Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Gly Asn Ser
385                 390                 395                 400
Phe Trp Gln Val Leu Ile Ala Ser Thr Asp Glu Ala Lys Glu Ala Met
                                                        415
Arg Gln Gly Cys Phe Asp Tyr Leu Ser Ser Gly Gly Phe Arg Thr Ser
                                                    430
Gly Leu Met Ala Leu Ile Gly Gly Met Asn Pro Arg Pro Leu Ser Leu
                                                445
Phe Tyr His Leu Phe Val Ile Ser Leu Ser Ser Ile Gly Gln Leu Leu
                                            460
Ser Pro Phe Pro Thr Pro Leu Arg Val Trp His Ser Leu Arg Leu Leu
465                 470                 475                 480
Asp Leu Ser Leu Lys Met Leu Val Pro His Leu Lys Ala Glu Gly Ile
                                                        495
Gly Gln Met Leu Ser Pro Thr Asn Ala Ala Ala Tyr Arg Lys Ser Tyr
                                                    510
Met Ala Ala Thr Val Val
        515

(2) INFORMATION FOR SEQ ID NO: 5:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 1893 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: cDNA
   (iii) HYPOTHETICAL: NO
    (iv) ANTI-SENSE: NO
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Brassica napus
          (B) STRAIN: Westar
          (D) DEVELOPMENTAL STAGE: 14 day greening-etiolated
          (F) TISSUE TYPE: hypocotyls
   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: Tsang
          (B) CLONE: pDR411
    (ix) FEATURE:
          (A) NAME/KEY: 5'UTR
          (B) LOCATION:1..28
    (ix) FEATURE:
          (A) NAME/KEY: exon
          (B) LOCATION:29..1466
    (ix) FEATURE:
          (A) NAME/KEY: intron
          (B) LOCATION:1467..1623
    (ix) FEATURE:
          (A) NAME/KEY: exon
          (B) LOCATION:1624..1697
    (ix) FEATURE:
          (A) NAME/KEY: 3'UTR
          (B) LOCATION:1698..1893
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CCACGCGTCC GCGGACGCGT GGGCAGATAT GGATCTAGCT TTTCCGCACG TTTGTTTGTG      60
GACGCTACTC GCCTTTGTGC TGACTTGGAC AGTGTTCTAC GTCAACAACA GGAGGAAGAA     120
GGTGGCGAAG TTACCCGATG CGGCGACAGA GGTGAGAAGA GACGGTGATG CTGACGTCAT     180
CATCGTCGGA GCTGGTGTTG GAGGTTCAGC TCTCGCCTAC GCTCTTGCAA AGGATGGGCG     240
TCGAGTACAT GTGATAGAGA GGGACATGAG GGAACCAGTG AGAATGATGG GTGAATTTAT     300
GCAACCCGGT GGACGACTAC TGCTTTCTAA GCTTGGTCTT GAAGATTGTT TGGAGGGAAT     360
AGATGAACAG ATAGCCACAG GCTTAGCAGT TTATAAGGAC GGACAAAAAG CACTCGTGTC     420
TTTTCCAGAG GACAACGACT TTCCTTATGA ACCTACTGGT CGAGCTTTTT ATAATGGCCG     480
TTTTGTCCAG AGACTGCGCC AAAAGGCTTC TTCGCTCCCC ACTGTACAAC TTGAAGAAGG     540
GACTGTAAAA TCTTTGATAG AAGAAAAAGG AGTGATCAAA GGAGTGACAT ACAAGAATAG     600
TGCAGGCGAA GAAACGACTG CATTTGCACC TCTCACAGTG GTATGCGACG GTTGCTATTC     660
AAACCTTCGT CGGTCTGTTA ACGACAACAA TGCGGAGGTT ATATCGTACC AAGTTGGTTA     720
CGTCTCAAAG AATTGTCAGC TTGAAGATCC TGAAAAGTTA AAATTGATAA TGTCTAAACC     780
TTCCTTCACC ATGTTGTATC AAATAAGCAG CACCGATGTT CGTTGTGTTA TGGAGATTTT     840
CCCCGGCAAT ATTCCTTCTA TTTCAAATGG CGAAATGGCT GTTTATTTGA AAAATACTAT     900
GGCTCCTCAG GTACCTCCAG AACTCCGCAA AATATTTTTG AAAGGAATTG ATGAGGGAGC     960
ACAAATTAAA GCGATGCCAA CAAAGAGAAT GGAAGCTACT TTGAGCGAAA AGCAAGGAGT    1020
GATTGTGTTG GGAGATGCAT TCAACATGCG CCACCCAGCG ATTGCCTCTG GAATGATGGT    1080
TGTATTATCT GACATTCTCA TTCTACGCCG CCTTCTCCAG CCATTGCGAA ACCTCAGTGA    1140
TGCAAATAAA GTATCAGAAG TTATTAAGTC ATTTTATGTC ATCCGAAAGC CAATGTCAGC    1200
GACGGTGAAC ACGCTAGGAA ATGCATTTTC TCAAGTGCTA ATTGCATCTA CGGACGAAGC    1260
AAAAGAAGCG ATGCGACAAG GCTGTTTTGA TTACCTCTCT AGTGGCGGCT TTCGCACGTC    1320
AGGAATGATG GCTCTGCTCG GTGGCATGAA CCCTCGACCA CTCTCTCTCA TCTTTCATCT    1380
ATGTGGTATT ACTCTATCCT CCATTGGTCA ACTGCTCTCG CCATTTCCAT CTCCTCTTGG    1440
CATTTGGCAT AGCCTCAGAC TTTTTGGTGT AAGTCATTAT CTCCCTCCCT ATGTTATTTA    1500
CATATTTTTC TTTGTGTTAT ATATTTTGTA AATAATTTAC AATTGAATTT TGACATTTTC    1560
TTGTTGTTTA TGTGTATGCC TAATTGTCTA TGAAAATGTT GGTTCCTCAT CTTAAGGCTG    1620
AAGGGGTTAG CCAAATGCTG TCTCCAGCAT ACGCAGCCGC GTATCGCAAA AGCTATATGA    1680
CCGCAACCGC TCTCTAAGCA TCGATGATAA GAACCGCGAA TGATACTATG ACATATTTGG    1740
AGCGCTAGTA TTTTGTGGTT TTGCATCCGT TAAAAATTTA AAATGTGTTG CTGTGTGTTT    1800
ACTATTATTA GTGTATTACC TGGAAAATAC CCGTGGGTAT ATTCTAAATG TATAAAATAT    1860
#       1893       ACTC TCCGTTTGGT TGG

(2) INFORMATION FOR SEQ ID NO: 6:
     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 572 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear
    (ii) MOLECULE TYPE: protein
    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Mus musculus
          (B) STRAIN: B6CBA
          (D) DEVELOPMENTAL STAGE: 6-8 weeks
          (F) TISSUE TYPE: liver
   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: Lambda ZAP vector Stratagene catalog #935302
          (B) CLONE: pMMSE-17
     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Kosuga, K.
               Hata, S.
               Osumi, T.
               Sakakibara, J.
               Ono, T.
          (B) TITLE: Nucleotide sequence of a cDNA for mouse
               squalene epoxidase
          (C) JOURNAL: Biochim. Biophys. Acta
          (D) VOLUME: 1260
          (E) ISSUE: 3
          (F) PAGES: 345-348
          (G) DATE: 1995
    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Trp Thr Phe Leu Gly Ile Ala Thr Phe Thr Tyr Phe Tyr Lys Lys
                                                         15
Cys Gly Asp Val Thr Leu Ala Asn Lys Glu Leu Leu Leu Cys Val Leu
                                                     30
Val Phe Leu Ser Leu Gly Leu Val Leu Ser Tyr Arg Cys Arg His Arg
                                                 45
His Gly Gly Leu Leu Gly Arg His Gln Ser Gly Ala Gln Phe Ala Ala
                                             60
Phe Ser Asp Ile Leu Ser Ala Leu Pro Leu Ile Gly Phe Phe Trp Ala
                                                             80
Lys Ser Pro Glu Ser Glu Lys Lys Glu Gln Leu Glu Ser Lys Lys Cys
                                                         95
Arg Lys Glu Ile Gly Leu Ser Glu Thr Thr Leu Thr Gly Ala Ala Thr
                                                    110
Ser Val Ser Thr Ser Phe Val Thr Asp Pro Glu Val Ile Ile Val Gly
                                                125
Ser Gly Val Leu Gly Ser Ala Leu Ala Ala Val Leu Ser Arg Asp Gly
                                            140
Arg Lys Val Thr Val Ile Glu Arg Asp Leu Lys Glu Pro Asp Arg Ile
145                 150                 155                 160
Val Gly Glu Leu Leu Gln Pro Gly Gly Tyr Arg Val Leu Gln Glu Leu
                                                        175
Gly Leu Gly Asp Thr Val Glu Gly Leu Asn Ala His His Ile His Gly
                                                    190
Tyr Ile Val His Asp Tyr Glu Ser Arg Ser Glu Val Gln Ile Pro Tyr
                                                205
Pro Leu Ser Glu Thr Asn Gln Val Gln Ser Gly Ile Ala Phe His His
                                            220
Gly Arg Phe Ile Met Ser Leu Arg Lys Ala Ala Met Ala Glu Pro Asn
225                 230                 235                 240
Val Lys Phe Ile Glu Gly Val Val Leu Gln Leu Leu Glu Glu Asp Asp
                                                        255
Ala Val Ile Gly Val Gln Tyr Lys Asp Lys Glu Thr Gly Asp Thr Lys
                                                    270
Glu Leu His Ala Pro Leu Thr Val Val Ala Asp Gly Leu Phe Ser Lys
                                                285
Phe Arg Lys Ser Leu Ile Ser Ser Lys Val Ser Val Ser Ser His Phe
                                            300
Val Gly Phe Leu Met Lys Asp Ala Pro Gln Phe Lys Pro Asn Phe Ala
305                 310                 315                 320
Glu Leu Val Leu Val Asn Pro Ser Pro Val Leu Ile Tyr Gln Ile Ser
                                                        335
Ser Ser Glu Thr Arg Val Leu Val Asp Ile Arg Gly Glu Leu Pro Arg
                                                    350
Asn Leu Arg Glu Tyr Met Ala Glu Gln Ile Tyr Pro Gln Leu Pro Glu
                                                365
His Leu Lys Glu Ser Phe Leu Glu Ala Ser Gln Asn Gly Arg Leu Arg
                                            380
Thr Met Pro Ala Ser Phe Leu Pro Pro Ser Ser Val Asn Lys Arg Gly
385                 390                 395                 400
Val Leu Ile Leu Gly Asp Ala Tyr Asn Leu Arg His Pro Leu Thr Gly
                                                        415
Gly Gly Met Thr Val Ala Leu Lys Asp Ile Lys Leu Trp Arg Gln Leu
                                                    430
Leu Lys Asp Ile Pro Asp Leu Tyr Asp Asp Ala Ala Ile Phe Gln Ala
                                                445
Lys Lys Ser Phe Phe Trp Ser Arg Lys Arg Thr His Ser Phe Val Val
                                            460
Asn Val Leu Ala Gln Ala Leu Tyr Glu Leu Phe Ser Ala Thr Asp Asp
465                 470                 475                 480
Ser Leu His Gln Leu Arg Lys Ala Cys Phe Leu Tyr Phe Lys Leu Gly
                                                        495
Gly Glu Cys Val Thr Gly Pro Val Gly Leu Leu Ser Ile Leu Ser Pro
                                                    510
His Pro Leu Val Leu Ile Arg His Phe Phe Ser Val Ala Ile Tyr Ala
                                                525
Thr Tyr Phe Cys Phe Lys Ser Glu Pro Trp Ala Thr Lys Pro Arg Ala
    530                 535                 540

Leu Phe Ser Ser Gly Ala Val Leu Tyr Lys Ala Cys Ser Ile Leu Phe
545                 550                 555                 560

Pro Leu Ile Tyr Ser Glu Met Lys Tyr Leu Val His
                565                 570

(2) INFORMATION FOR SEQ ID NO: 7:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 573 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Rattus norvegicus
          (F) TISSUE TYPE: kidney
          (H) CELL LINE: NRK

   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: pcD2 library of H. Okayama
          (B) CLONE: Tb-1

     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Sakakibara, J.
               Watanabe, R.
               Kanai, R.
               Ono, T.
          (B) TITLE: Molecular cloning and expression of rat
               squalene epoxidase
          (C) JOURNAL: J. Biol. Chem.
          (D) VOLUME: 270
          (E) ISSUE: 1
          (F) PAGES: 17-20
          (G) DATE: 1995

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Trp Thr Phe Leu Gly Ile Ala Thr Phe Thr Tyr Phe Tyr Lys Lys
1               5                   10                  15

Cys Gly Asp Val Thr Leu Ala Asn Lys Glu Leu Leu Leu Cys Val Leu
            20                  25                  30

Val Phe Leu Ser Leu Gly Leu Val Leu Ser Tyr Arg Cys Arg His Arg
        35                  40                  45

Asn Gly Gly Leu Leu Gly Arg His Gln Ser Gly Ser Gln Phe Ala Ala
    50                  55                  60

Phe Ser Asp Ile Leu Ser Ala Leu Pro Leu Ile Gly Phe Phe Trp Ala
65                  70                  75                  80

Lys Ser Pro Pro Glu Ser Glu Lys Lys Glu Gln Leu Glu Ser Lys Arg
                85                  90                  95

Arg Arg Lys Glu Val Asn Leu Ser Glu Thr Thr Leu Thr Gly Ala Ala
            100                 105                 110
Thr Ser Val Ser Thr Ser Ser Val Thr Asp Pro Glu Val Ile Ile Ile
        115                 120                 125

Gly Ser Gly Val Leu Gly Ser Ala Leu Ala Thr Val Leu Ser Arg Asp
    130                 135                 140

Gly Arg Thr Val Thr Val Ile Glu Arg Asp Leu Lys Glu Pro Asp Arg
145                 150                 155                 160

Ile Leu Gly Glu Cys Leu Gln Pro Gly Gly Tyr Arg Val Leu Arg Glu
                165                 170                 175

Leu Gly Leu Gly Asp Thr Val Glu Ser Leu Asn Ala His His Ile His
            180                 185                 190

Gly Tyr Val Ile His Asp Cys Glu Ser Arg Ser Glu Val Gln Ile Pro
        195                 200                 205

Tyr Pro Val Ser Glu Asn Asn Gln Val Gln Ser Gly Val Ala Phe His
    210                 215                 220

His Gly Lys Phe Ile Met Ser Leu Arg Lys Ala Ala Met Ala Glu Pro
225                 230                 235                 240

Asn Val Lys Phe Ile Glu Gly Val Val Leu Arg Leu Leu Glu Glu Asp
                245                 250                 255

Asp Ala Val Ile Gly Val Gln Tyr Lys Asp Lys Glu Thr Gly Asp Thr
            260                 265                 270

Lys Glu Leu His Ala Pro Leu Thr Val Val Ala Asp Gly Leu Phe Ser
        275                 280                 285

Lys Phe Arg Lys Asn Leu Ile Ser Asn Lys Val Ser Val Ser Ser His
    290                 295                 300

Phe Val Gly Phe Ile Met Lys Asp Ala Pro Gln Phe Lys Ala Asn Phe
305                 310                 315                 320

Ala Glu Leu Val Leu Val Asp Pro Ser Pro Val Leu Ile Tyr Gln Ile
                325                 330                 335

Ser Pro Ser Glu Thr Arg Val Leu Val Asp Ile Arg Gly Glu Leu Pro
            340                 345                 350

Arg Asn Leu Arg Glu Tyr Met Thr Glu Gln Ile Tyr Pro Gln Ile Pro
        355                 360                 365

Asp His Leu Lys Glu Ser Phe Leu Glu Ala Cys Gln Asn Ala Arg Leu
    370                 375                 380

Arg Thr Met Pro Ala Ser Phe Leu Pro Pro Ser Ser Val Asn Lys Arg
385                 390                 395                 400

Gly Val Leu Leu Leu Gly Asp Ala Tyr Asn Leu Arg His Pro Leu Thr
                405                 410                 415

Gly Gly Gly Met Thr Val Ala Leu Lys Asp Ile Lys Ile Trp Arg Gln
            420                 425                 430

Leu Leu Lys Asp Ile Pro Asp Leu Tyr Asp Asp Ala Ala Ile Phe Gln
        435                 440                 445

Ala Lys Lys Ser Phe Phe Trp Ser Arg Lys Arg Ser His Ser Phe Val
    450                 455                 460

Val Asn Val Leu Ala Gln Ala Leu Tyr Glu Leu Phe Ser Ala Thr Asp
465                 470                 475                 480
Asp Ser Leu Arg Gln Leu Arg Lys Ala Cys Phe Leu Tyr Phe Lys Leu
                485                 490                 495

Gly Gly Glu Cys Leu Thr Gly Pro Val Gly Leu Leu Ser Ile Leu Ser
            500                 505                 510

Pro Asp Pro Leu Leu Leu Ile Arg His Phe Phe Ser Val Ala Val Tyr
        515                 520                 525

Ala Thr Tyr Phe Cys Phe Lys Ser Glu Pro Trp Ala Thr Lys Pro Arg
    530                 535                 540

Ala Leu Phe Ser Ser Gly Ala Ile Leu Tyr Lys Ala Cys Ser Ile Ile
545                 550                 555                 560

Phe Pro Leu Ile Tyr Ser Glu Met Lys Tyr Leu Val His
                565                 570

(2) INFORMATION FOR SEQ ID NO: 8:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 496 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS:
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Saccharomyces cerevisiae
          (B) STRAIN: A2-M8

     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Jandrositz, A.
               Hoegenauer, G.
               Turnowsky, F.
          (B) TITLE: The gene encoding squalene epoxidase from
               Saccharomyces cerevisiae: cloning and
               characterization
          (C) JOURNAL: Gene
          (D) VOLUME: 107
          (F) PAGES: 155-160
          (G) DATE: 1991

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Ser Ala Val Asn Val Ala Pro Glu Leu Ile Asn Ala Asp Asn Thr
1               5                   10                  15

Ile Thr Tyr Asp Ala Ile Val Ile Gly Ala Gly Val Ile Gly Pro Cys
            20                  25                  30

Val Ala Thr Gly Leu Ala Arg Lys Gly Lys Lys Val Leu Ile Val Glu
        35                  40                  45

Arg Asp Trp Ala Met Pro Asp Arg Ile Val Gly Glu Leu Met Gln Pro
    50                  55                  60

Gly Gly Val Arg Ala Leu Arg Ser Leu Gly Met Ile Gln Ser Ile Asn
65                  70                  75                  80

Asn Ile Glu Ala Tyr Pro Val Thr Gly Tyr Thr Val Phe Phe Asn Gly
                85                  90                  95

Glu Gln Val Asp Ile Pro Tyr Pro Tyr Lys Ala Asp Ile Pro Lys Val
            100                 105                 110

Glu Lys Leu Lys Asp Leu Val Lys Asp Gly Asn Asp Lys Val Leu Glu
        115                 120                 125

Asp Ser Thr Ile His Ile Lys Asp Tyr Glu Asp Asp Glu Arg Glu Arg
    130                 135                 140

Gly Val Ala Phe Val His Gly Arg Phe Leu Asn Asn Leu Arg Asn Ile
145                 150                 155                 160

Thr Ala Gln Glu Pro Asn Val Thr Arg Val Gln Gly Asn Cys Ile Glu
                165                 170                 175

Ile Leu Lys Asp Glu Lys Asn Glu Val Val Gly Ala Lys Val Asp Ile
            180                 185                 190

Asp Gly Arg Gly Lys Val Glu Phe Lys Ala His Leu Thr Phe Ile Cys
        195                 200                 205

Asp Gly Ile Phe Ser Arg Phe Arg Lys Glu Leu His Pro Asp His Val
    210                 215                 220

Pro Thr Val Gly Ser Ser Phe Val Gly Met Ser Leu Phe Asn Ala Lys
225                 230                 235                 240

Asn Pro Ala Pro Met His Gly His Val Ile Phe Gly Ser Asp His Met
                245                 250                 255

Pro Ile Leu Val Tyr Gln Ile Ser Pro Glu Glu Thr Arg Ile Leu Cys
            260                 265                 270

Ala Tyr Asn Ser Pro Lys Val Pro Ala Asp Ile Lys Ser Trp Met Ile
        275                 280                 285
Lys Asp Val Gln Pro Phe Ile Pro Lys Ser Leu Arg Pro Ser Phe Asp
    290                 295                 300

Glu Ala Val Ser Gln Gly Lys Phe Arg Ala Met Pro Asn Ser Tyr Leu
305                 310                 315                 320

Pro Ala Arg Gln Asn Asp Val Thr Gly Met Cys Val Ile Gly Asp Ala
                325                 330                 335

Leu Asn Met Arg His Pro Leu Thr Gly Gly Gly Met Thr Val Gly Leu
            340                 345                 350

His Asp Val Val Leu Leu Ile Lys Lys Ile Gly Asp Leu Asp Phe Ser
        355                 360                 365

Asp Arg Glu Lys Val Leu Asp Glu Leu Leu Asp Tyr His Phe Glu Arg
    370                 375                 380

Lys Ser Tyr Asp Ser Val Ile Asn Val Leu Ser Val Ala Leu Tyr Ser
385                 390                 395                 400

Leu Phe Ala Ala Asp Ser Asp Asn Leu Lys Ala Leu Gln Lys Gly Cys
                405                 410                 415

Phe Lys Tyr Phe Gln Arg Gly Gly Asp Cys Val Asn Lys Pro Val Glu
            420                 425                 430

Phe Leu Ser Gly Val Leu Pro Lys Pro Leu Gln Leu Thr Arg Val Phe
        435                 440                 445

Phe Ala Val Ala Phe Tyr Thr Ile Tyr Leu Asn Met Glu Glu Arg Gly
    450                 455                 460

Phe Leu Gly Leu Pro Met Ala Leu Leu Glu Gly Ile Met Ile Leu Ile
465                 470                 475                 480

Thr Ala Ile Arg Val Phe Thr Pro Phe Leu Phe Gly Glu Leu Ile Gly
                485                 490                 495

(2) INFORMATION FOR SEQ ID NO: 9:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 536 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA

   (iii) HYPOTHETICAL: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Arabidopsis thaliana
          (B) STRAIN: Columbia
          (D) DEVELOPMENTAL STAGE: 4 different stages and tissues

   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: Lambda-PRL2
          (B) CLONE: 250F2T7

     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: Newman, T.
               deBruijn, F. J.
               Green, P.
               Keegstra, K.
               Kende, H.
               McIntosh, L.
               Ohlrogge, J.
               Raikhel, N.
               Somerville, S.
               Thomashow, M.
          (B) TITLE: Genes galore: a summary of methods for
               accessing results from large-scale partial
               sequencing of anonymous Arabidopsis cDNA clones
          (C) JOURNAL: Plant Physiol.
          (D) VOLUME: 106
          (F) PAGES: 1241-1255
          (G) DATE: 1994

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAGAACATAT AAAAGCCATG CCAACAAAGA AGATGACAGC TACTTTGAGC GAGAAGAAAG     60
GAGTGATTTT ATTGGGAGAT GCATTCAACA TGCGTCATCC AGCAATCGCA TCTGGAATGA    120
TGGTTTTATT ATCTGACATT CTCATTCTAC GCCGTCTTCT CCAGCCATTA AGCAACCTTG    180
GCAATGCGCA AAAAATCTCA CAAGTTATCA AGTCCTTTTA TGATATCCGC AAGCCAATGT    240
CAGCGACAGT TAACACGTTA GGAAATGCAT TCTCTCAAGT GCTAGTTGCA TCGACGGACG    300
AAGCAAAAGA GGCAATGAGA CAAGGTTGCT ATGATTACCT CTCTAGTGGT GGGTTTCGCA    360
CGTCAGGGAT GATGGCTTTG CTAGGCGGAT GAACCCTCGT CCGATCTCTC NCATCNANCA    420
NCNAGGGGAA CACNCANCCC CATNGGCATC AACNCCNCAT TCCCNNCCCT TCGATTGGAA    480
CCTCGACTTT TGGTGGNNNA AAGGTGGCCC CCCANGGGAA GGTTCCATNT NTCCNC        536

(2) INFORMATION FOR SEQ ID NO: 10:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 540 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: double
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA to mRNA

   (iii) HYPOTHETICAL: NO

    (iv) ANTI-SENSE: NO

    (vi) ORIGINAL SOURCE:
          (A) ORGANISM: Ricinus communis
          (B) STRAIN: Baker 296
          (D) DEVELOPMENTAL STAGE: immature castor fruits
          (F) TISSUE TYPE: endosperm and embryo

   (vii) IMMEDIATE SOURCE:
          (A) LIBRARY: lambdaZAPST
          (B) CLONE: pcrs547

     (x) PUBLICATION INFORMATION:
          (A) AUTHORS: van de Loo, F. J.
               Turner, S.
               Somerville, C.
          (B) TITLE: Expressed sequence tags from developing
               castor seeds
          (C) JOURNAL: Plant Physiol.
          (D) VOLUME: 108
          (F) PAGES: 1141-1150
          (G) DATE: 1995

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTTGAGCTCA GAGTCACAGA TATAGACATC CTAGGGAAAA CATTCTCCTA TAAACTAAAG     60
CGTATTACAA TTCACACTTC TTTTCCCCTC AACTTTGATT TGAACAAAGG GATGAGATTA    120
AAACCAAAAT GAGAAACGCC CCGTTCCTTC TTGTCACGAA TTTTTCACTC ACATTCTTGT    180
CAAACTAATT GCATTCAACA GGAGGAGCTC TATAATATGC TGGGACGGTT GCGGGGAAGA    240
ACATCTGTCT AACTCCTTCT GCCTTGATAA TGGGGAAGAT GATTCCTGAT GCACCCGATA    300
TCAACCTAGC TCCAACCCAG ACGCGCTTAG GTGAAGGGAA TGGCAGTAAC AAAGGGGGGG    360
CCCGGTACCC AATTTGCCCT ATAGTGAGCC GTATTCAATN ACTGGCCGTT GTTTCAACGT    420
GTGCCTTGGG AAACCCTGGG GTNCCACTTA TTGCTTCAGA CATCCCCTTT GCANTTGGTA    480
TTNGAGGGGC CGACCGTTGC CTCCAANAGT NCNCGTTNAA TTGGGTTGAA ANTTNCGGGA    540

(2) INFORMATION FOR SEQ ID NO: 11:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 503 amino acids
          (B) TYPE: amino acid
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: protein

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Asp Leu Ala Phe Pro His Val Cys Leu Trp Thr Leu Leu Ala Phe
1               5                   10                  15

Val Leu Thr Trp Thr Val Phe Tyr Val Asn Asn Arg Arg Lys Lys Val
            20                  25                  30

Ala Lys Leu Pro Asp Ala Ala Thr Glu Val Arg Arg Asp Gly Asp Ala
        35                  40                  45

Asp Val Ile Ile Val Gly Ala Gly Val Gly Gly Ser Ala Leu Ala Tyr
    50                  55                  60

Ala Leu Ala Lys Asp Gly Arg Arg Val His Val Ile Glu Arg Asp Met
65                  70                  75                  80

Arg Glu Pro Val Arg Met Met Gly Glu Phe Met Gln Pro Gly Gly Arg
                85                  90                  95

Leu Leu Leu Ser Lys Leu Gly Leu Glu Asp Cys Leu Glu Gly Ile Asp
            100                 105                 110

Glu Gln Ile Ala Thr Gly Leu Ala Val Tyr Lys Asp Gly Gln Lys Ala
        115                 120                 125

Leu Val Ser Phe Pro Glu Asp Asn Asp Phe Pro Tyr Glu Pro Thr Gly
    130                 135                 140

Arg Ala Phe Tyr Asn Gly Arg Phe Val Gln Arg Leu Arg Gln Lys Ala
145                 150                 155                 160

Ser Ser Leu Pro Thr Val Gln Leu Glu Glu Gly Thr Val Lys Ser Leu
                165                 170                 175

Ile Glu Glu Lys Gly Val Ile Lys Gly Val Thr Tyr Lys Asn Ser Ala
            180                 185                 190

Gly Glu Glu Thr Thr Ala Phe Ala Pro Leu Thr Val Val Cys Asp Gly
        195                 200                 205

Cys Tyr Ser Asn Leu Arg Arg Ser Val Asn Asp Asn Asn Ala Glu Val
    210                 215                 220

Ile Ser Tyr Gln Val Gly Tyr Val Ser Lys Asn Cys Gln Leu Glu Asp
225                 230                 235                 240

Pro Glu Lys Leu Lys Leu Ile Met Ser Lys Pro Ser Phe Thr Met Leu
                245                 250                 255

Tyr Gln Ile Ser Ser Thr Asp Val Arg Cys Val Met Glu Ile Phe Pro
            260                 265                 270

Gly Asn Ile Pro Ser Ile Ser Asn Gly Glu Met Ala Val Tyr Leu Lys
        275                 280                 285

Asn Thr Met Ala Pro Gln Val Pro Pro Glu Leu Arg Lys Ile Phe Leu
    290                 295                 300

Lys Gly Ile Asp Glu Gly Ala Gln Ile Lys Ala Met Pro Thr Lys Arg
305                 310                 315                 320

Met Glu Ala Thr Leu Ser Glu Lys Gln Gly Val Ile Val Leu Gly Asp
                325                 330                 335

Ala Phe Asn Met Arg His Pro Ala Ile Ala Ser Gly Met Met Val Val
            340                 345                 350

Leu Ser Asp Ile Leu Ile Leu Arg Arg Leu Leu Gln Pro Leu Arg Asn
        355                 360                 365

Leu Ser Asp Ala Asn Lys Val Ser Glu Val Ile Lys Ser Phe Tyr Val
    370                 375                 380

Ile Arg Lys Pro Met Ser Ala Thr Val Asn Thr Leu Gly Asn Ala Phe
385                 390                 395                 400

Ser Gln Val Leu Ile Ala Ser Thr Asp Glu Ala Lys Glu Ala Met Arg
                405                 410                 415

Gln Gly Cys Phe Asp Tyr Leu Ser Ser Gly Gly Phe Arg Thr Ser Gly
            420                 425                 430

Met Met Ala Leu Leu Gly Gly Met Asn Pro Arg Pro Leu Ser Leu Ile
        435                 440                 445

Phe His Leu Cys Gly Ile Thr Leu Ser Ser Ile Gly Gln Leu Leu Ser
    450                 455                 460

Pro Phe Pro Ser Pro Leu Gly Ile Trp His Ser Leu Arg Leu Phe Gly
465                 470                 475                 480

Val Ser Gln Met Leu Ser Pro Ala Tyr Ala Ala Ala Tyr Arg Lys Ser
                485                 490                 495

Tyr Met Thr Ala Thr Ala Leu
            500

REFERENCES

Anonymous (1995) Developments in Calgene's plant oils unit. Biotech. Rep. (July): 3.

Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. (1990) Basic local alignment search tool. J. Mol. Biol. 215, 403-410.

Ausubel F M, Brent R, Kingston R E, Moore D D, Seidman J G, Smith J A, Struhl K, Albright L M, Coen D M, Varki A (eds) (1994) Current Protocols in Molecular Biology. John Wiley & Sons.

Barinaga, M. (1993) Ribozymes: killing the messenger. Science 262, 1512-1514.

Bechtold, N., Ellis, J., and Pelletier, G. (1993) In planta Agrobacterium-mediated gene transfer by infiltration of adult Arabidopsis thaliana plants. C R Acad Sci Paris, Sciences de la vie/Life sciences 316: 1194-1199.

Beltz, G. A., Jacobs, K. A., Eickbush, T. H., Cherbas, P. T. and Kafatos, F. C. (1983) Isolation of multigene families and determination of homologies by filter hybridization. Meth. Enzymol. 100, 266-285.

Bevan, M. (1984) Binary Agrobacterium Vectors for Plant Transformation. Nucl. Acids Res. 12, 8711-8721.

Bondioli, P., Mariani, C., Lanzani, A., Fedeli, E., Mossa, A. and Muller, A. (1992) Lampante olive oil refining with supercritical carbon dioxide. J. Am. Oil Chem. Soc. 69, 477-480.

Bondioli, P., Mariani, C., Lanzani, A., Fedeli, E. and Muller, A. (1993) Squalene recovery from olive oil deodorizer distillates. J. Am. Oil Chem. Soc. 70, 763-766.

Bourque, J. E. (1995) Antisense strategies for genetic manipulations in plants. Plant Sci. 105, 125-149.

Chappell, J., Saunders, C. A. and Wolf, F. R., inventors; Amoco Corp. (1994) Process and composition for increasing squalene and sterol accumulation in higher plants. U.S. Pat. No. 5,349,126.

Christou, P. (1993) Particle gun mediated transformation. Curr. Opin. Biotech. 4, 135-141.

Datla, R. S. S., Hammerlindl, J. K., Panchuk, B., Pelcher, L. and Keller, W. (1992) Modified binary plant transformation vectors with the wild-type gene encoding NPTII. Gene 122, 383-384.

DeBlock, M., DeBrouwer, D., and Tenning, P. (1989) Transformation of Brassica napus and Brassica oleracea using Agrobacterium tumefaciens and the expression of the bar and neo genes in the transgenic plants. Plant Physiol. 91: 694-701.

Deprez, P. P., Volkman, J. K. and Davenport, S. R. (1990) Squalene content and neutral lipid composition of livers from deep-sea sharks caught in Tasmanian waters. Aust. J. Mar. Freshwater Res. 41, 375-387.

Goodall, G. J. and Filipowicz, W. (1991) Different effects of intron nucleotide composition and secondary structure on pre-mRNA splicing in monocot and dicot plants. EMBO J. 10, 2635-2644.

Guerineau, F. and Mullineaux, P. (1993) Plant Transformation and Expression Vectors. In: Croy, R. R. D. (Ed.) Plant Molecular Biology Labfax, pp. 121-147. Oxford: Bios Scientific.

Inouye, M. (1993) Regulation of gene expression by employing translational inhibition of mRNA utilizing interfering complementary DNA. U.S. Pat. No. 5,190,931.

Jandrositz, A., Turnowsky, F. and Hoegenauer, G. (1991) The gene encoding squalene epoxidase from Saccharomyces cerevisiae: cloning and characterization. Gene 107, 155-160.

Jorgensen, R. (1990) Altered gene expression in plants due to trans interactions between homologous genes. Trends Biotech. 8, 340-344.

Josefsson, L.-G., Lenman, M., Ericson, M. L. and Rask, L. (1987) Structure of a gene encoding the 1.7S storage protein, napin, from Brassica napus. J. Biol. Chem. 262, 12196-12201.

Kaiya, A. (1990) The use of natural squalene and squalane, and the latest situation of the raw materials. Yukagaku 39, 525-529.

Katavic, V., Haughn, G. W., Reed, D., Martin, M. and Kunst, L. (1994) In planta transformation of Arabidopsis thaliana. Mol. Gen. Genet. 245, 363-370.

Katavic, V., Reed, D. W., Taylor, D. C., Giblin, E. M., Barton, D. L., Zou, J., MacKenzie, S. L., Covello, P. S. and Kunst, L. (1995) Alteration of seed fatty acid composition by an ethyl methanesulfonate-induced mutation in Arabidopsis thaliana affecting diacylglycerol acyltransferase activity. Plant Physiol. 108, 399-409.

Koncz, C. and Schell, J. (1986) The promoter of TL-DNA gene 5 controls the tissue-specific expression of chimeric genes carried by a novel type of Agrobacterium binary vector. Mol. Gen. Genet. 204: 383-396.

Matzke, M. A. and Matzke, A. J. M. (1995) Homology-dependent gene silencing in transgenic plants: what does it really tell us? Trends Genet. 11, 1-3.

Meyer, P. (ed.) (1995) Gene Silencing in Higher Plants and Related Phenomena in Other Eukaryotes. Berlin: Springer.

Moloney, M. M., Walker, J. M. and Sharma, K. K. (1989) High efficiency transformation of Brassica napus using Agrobacterium vectors. Plant Cell Reports, 8: 238-242.

Murphy, D. J. (1996) Engineering oil production in rapeseed and other oil crops. Trends Biotech. 14, 206-213.

Ramamurthi, S. (1994) Reaction Kinetics and Potential Application of Lipase-catalyzed Esterification of Fatty Acids with Methanol. Ph.D. thesis, University of Saskatchewan.

Sakakibara, J., Watanabe, R., Kanai, Y. and Ono, T. (1995) Molecular cloning and expression of rat squalene epoxidase. J. Biol. Chem. 270, 17-20.

Stam, M., Mol, J. N. M. and Kooter, J. M. (1997) The silence of genes in transgenic plants. Annals of Botany 79, 3-12.

Steinecke, P., Herget, T. and Schreier, P. H. (1992) Expression of a chimeric ribozyme gene results in endonucleolytic cleavage of target mRNA and a concomitant reduction of gene expression in vivo. EMBO J. 11, 1525-1530.

Thomas, C. M., Jagura-Burdzy, G., Williams, D. R., Shah, D. and Thorsted, P. B. (1992) Replication, Maintenance and Transfer of Promiscuous IncP Plasmids. In: Balla, E. (Ed.) DNA Transfer and Gene Expression in Microorganisms, pp. 85-96. Andover: Intercept Ltd.

Wegener, D., Steinecke, P., Herget, T., Petereit, I., Philipp, C. and Schreier, P. H. (1994) Expression of a reporter gene is reduced by a ribozyme in transgenic plants. Mol. Gen. Genet. 245, 465-470.

Wierenga, R. K., Terpstra, P. and Hol, W. G. J. (1986) Prediction of the occurrence of the ADP-binding beta-alpha-beta-fold in proteins, using an amino acid sequence fingerprint. J. Mol. Biol. 187, 101-107.

Yamamoto, T. and Kadowaki, Y. (1995) Superfamilies of protooncogenes: homology cloning and characterization of related members. Meth. Enzymol. 254, 169-183.

Yates, P. J., Haughan, P. A., Lenton, J. R. and Goad, L. J. (1991) Effects of terbinafine on growth, squalene, and steryl ester contents of a celery suspension culture. Pesticide Biochem. Physiol. 40, 221-226.

Zhao, J. J. and Pick, L. (1993) Generating loss-of-function phenotypes of the fushi tarazu gene with a targeted ribozyme in Drosophila. Nature 365, 448-451.

The teachings of the above references are specifically incorporated herein by reference. 

What is claimed is:
 1. An isolated and cloned DNA encoding squalene epoxidase having an amino acid sequence with at least 60% similarity to SEQ ID NO:2, SEQ ID NO:4 or SEQ ID NO:11.
 2. The DNA according to claim 1, comprising a sequence having at least 60% identity to a specific sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5.
 3. An isolated DNA encoding squalene epoxidase having an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:11.
 4. An isolated DNA having the nucleotide sequence of SEQ ID NO:1.
 5. An isolated DNA having the nucleotide sequence of SEQ ID NO:3.
 6. An isolated DNA having the nucleotide sequence of SEQ ID NO:5.
 7. An isolated and cloned DNA encoding squalene epoxidase from a plant species of the family Brassicaceae.
 8. The plasmid pDR411 (ATCC 97845).
 9. The plasmid pDR111 (ATCC 97846).
 10. The plasmid p129F12T7 (ATCC 97847).
 11. A vector for introducing a nucleotide sequence into a plant genome, wherein said vector comprises a transcriptional promoter, a nucleotide sequence that is antisense to a gene encoding squalene epoxidase having an amino acid sequence with at least 60% similarity to SEQ ID NO:2, SEQ ID NO:4 or SEQ ID NO:11, and a transcription terminator, all operably linked.
 12. The vector according to claim 11, wherein said nucleotide sequence is antisense to a squalene epoxidase gene having at least 60% identity to SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5.
 13. The vector pSE129A (ATCC 97910).
 14. The vector pSE411A (ATCC 97908).
 15. The vector pSE111A (ATCC 97909).
 16. A vector for introducing a nucleotide sequence into a plant genome, wherein said vector comprises a transcriptional promoter, a nucleotide sequence, and a transcription terminator, all operably linked, wherein said nucleotide sequence encodes a squalene epoxidase having an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:11.
 17. A process of producing a genetically-modified plant, comprising introducing into the genome of a plant at least one heterologous DNA sequence encoding squalene epoxidase having an amino acid sequence with at least 60% similarity to SEQ ID NO:2, SEQ ID NO:4 or SEQ ID NO:11, whereby the plant genome is modified to suppress squalene epoxidase expression by said plant and thereby increase squalene levels above squalene levels of a corresponding wild-type plant.
 18. The process according to claim 17, wherein said at least one heterologous DNA sequence has at least 60% identity to a specific sequence selected from the group consisting of SEQ ID NO:1, SEQ ID NO:3 and SEQ ID NO:5.
 19. The process as claimed in claim 17, wherein said at least one heterologous DNA sequence is in the sense orientation such that said at least one heterologous DNA sequence decreases expression of squalene epoxidase by said plant by co-suppression or homology-dependent gene silencing.
 20. The process as claimed in claim 17, wherein said at least one heterologous DNA sequence is obtained by cloning.
 21. The process according to claim 17, wherein said introducing of said at least one heterologous DNA sequence is carried out by a procedure selected from the group consisting of Agrobacterium-mediated transformation and particle gun transformation.
 22. A process of producing a genetically-modified plant, comprising introducing a nucleotide sequence that reduces or prevents expression of squalene epoxidase into the genome of a plant, wherein said nucleotide sequence comprises a transcriptional promoter and an operably linked sequence antisense to at least one squalene epoxidase messenger RNA encoding squalene epoxidase having an amino acid sequence with at least 60% similarity to SEQ ID NO:2, SEQ ID NO:4 or SEQ ID NO:11, whereby the plant genome is modified to suppress squalene epoxidase expression by said plant and thereby increase squalene levels in the modified plant above squalene levels of corresponding wild-type plants.
 23. The process according to claim 22, wherein said nucleotide sequence has at least 60% identity to SEQ ID NO:1, SEQ ID NO:3 or SEQ ID NO:5.
 24. A genetically-modified plant of the family Brassicaceae that accumulates squalene at levels higher than a corresponding wild-type plant, wherein said genetically-modified plant is produced by the process of claim 17.
 25. A seed of a genetically-modified oilseed plant of the family Brassicaceae containing squalene at levels higher than seeds of corresponding wild-type plants, wherein said seed is from a genetically-modified oilseed plant that is produced by the process of claim 17.
 26. A process of producing squalene, comprising growing the genetically-modified plant as defined in claim 24, harvesting said genetically-modified plant or seeds of said plant, and extracting squalene from said genetically-modified plant or seeds.
 27. A genetically-modified plant of the family Brassicaceae having a genome comprising one or more DNA sequences comprising in operable linkage, a transcriptional promoter, a heterologous gene encoding squalene epoxidase and a transcription terminator, wherein said squalene epoxidase has an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:11.
 28. The plant of claim 27 being of the genus Arabidopsis.
 29. The plant of claim 27 being of the species Arabidopsis thaliana.
 30. The plant of claim 27 being of the genus Brassica.
 31. The plant of claim 30 being of the species Brassica napus.
 32. A seed of a genetically-modified plant of the family Brassicaceae having a genome comprising one or more DNA sequences comprising in operable linkage, a transcriptional promoter, a heterologous gene encoding squalene epoxidase, and a transcription terminator, wherein said squalene epoxidase has an amino acid sequence selected from the group consisting of SEQ ID NO:2, SEQ ID NO:4 and SEQ ID NO:11.
 33. The seed of claim 32 being of the genus Arabidopsis.
 34. The seed of claim 33 being of the species Arabidopsis thaliana.
 35. The seed of claim 32 being of the genus Brassica.
 36. The seed of claim 35 being of the species Brassica napus.
 37. A process of producing a genetically-modified plant, comprising introducing into the genome of a plant at least one heterologous DNA sequence encoding squalene epoxidase from a plant species of the family Brassicaceae.
 38. A process of producing a genetically-modified plant, comprising introducing a nucleotide sequence that reduces or prevents expression of squalene epoxidase into the genome of a plant, wherein said nucleotide sequence comprises a transcriptional promoter and an operably linked sequence antisense to at least one squalene epoxidase messenger RNA encoding squalene epoxidase from a plant of the family Brassicaceae, whereby the plant genome is modified to suppress squalene epoxidase expression by said plant and thereby increase squalene levels in the modified plant above squalene levels of corresponding wild-type plants.
 39. A genetically-modified plant of the family Brassicaceae that accumulates squalene at levels higher than a corresponding wild-type plant, wherein said genetically-modified plant is produced by the process of claim 37.
 40. A seed of a genetically-modified plant of the family Brassicaceae that accumulates squalene at levels higher than seeds of corresponding wild-type plants, wherein said genetically-modified plant is produced by the process of claim 37.
 41. A process of producing squalene, comprising growing the genetically-modified plant of claim 39, harvesting said genetically-modified plant or seeds of said plant, and extracting squalene from said genetically-modified plant or seeds.